Abstract —The secrecy rate represents the amount of information
per unit time that can be securely sent on a communication
link. In this work, we investigate the achievable secrecy rates
in an energy harvesting communication system composed of a
transmitter, a receiver and a malicious eavesdropper. In particular,
because of the energy constraints and the channel conditions,
it is important to understand when a device should transmit
and to optimize how much power should be used in order to
improve security. Both full knowledge and partial knowledge of
the channel are considered under a Nakagami fading scenario.
We show that high secrecy rates can be obtained only with
power and coding rate adaptation. Moreover, we highlight the
importance of optimally dividing the transmission power in the
frequency domain, and note that the optimal scheme provides
high gains in secrecy rate over the uniform power splitting case.
Analytically, we explain how to find the optimal policy and prove
some of its properties. In our numerical evaluation, we discuss
how the maximum achievable secrecy rate changes according
to the various system parameters. Furthermore, we discuss the
effects of a finite battery on the system performance and note
that, in order to achieve high secrecy rates, it is not necessary
to use very large batteries.

Index Terms —energy harvesting, secrecy rate, physical layer 
security, WSN, MDP, optimization, policies, finite battery. 

I. Introduction 

SECURITY and privacy are becoming more and more important
in communications and networking systems, and
have key applications in the Wireless Sensor Network (WSN)
and Internet of Things (IoT) world 0. While most works in
this area deal with security protocols 0, 0, implementing
security mechanisms at the physical layer represents an interesting
complement to those networking approaches 0, and
has the potential to provide stronger (information-theoretic)
secrecy properties 0.

In the context of energy-constrained and green networking,
the design of low-power systems and the use of renewable
energy sources in network systems are prominent areas of
investigation. In particular, the use of Energy Harvesting (EH)
technologies as a way to prolong unattended operation of a
network is becoming more and more appealing. However, despite
these trends, security and privacy issues so far have been
addressed mostly by neglecting low-power design principles
(except possibly for some attempts at limiting the computation
and processing costs and/or the number of messages needed
to implement a secure protocol). In particular, the impact
of power allocation policies and of system features related
to energy harvesting has only been studied in some special
cases 0, 0. Since green aspects will play an increasingly
large role in future networks, it is essential to bring low-power,
energy-constrained and green considerations into this picture.
In this paper, we try to partly fill this gap, studying how the
use of energy harvesting affects the design and performance
of physical layer security methods.

We consider an Energy Harvesting Device (EHD) (i.e.,
a device with the capability of gathering energy from the
environment 0, e.g., through a solar panel or a rectenna)
that sends data to a receiver over an insecure communication
channel. The goal is to transmit data securely, i.e., in such
a way that an adversary (or eavesdropper) with access to
the communication link is not able to gather useful information
about the data sent. We study how the specific EH
characteristics influence the achievable secrecy rate (i.e., the
information rate at which the EHD can reliably send data to
the receiver while keeping it secret from the eavesdropper).
Deciding whether the EHD should transmit or not, how much
power should be transmitted or how to divide the power
among the different sub-carriers is not obvious, and all these
aspects need to be appropriately optimized. Moreover, in the
classic throughput optimization problem, an improper use of
the available resources only results in a performance reduction,
whereas in the secrecy optimization problem it may imply not only
a reduced transmission rate, but also a security loss, possibly
making sensitive data accessible to a malicious party.

In the literature, many papers studied energy harvesting
communication systems because of their ability to increase
the network lifetime, provide self-sustainability and, ideally,
allow perpetual operations m- E) presented a survey on
the several different environmental energy harvesting technologies
for WSNs. Analytically, GD formulated the problem
of maximizing the average value of the reported data using
a node with a rechargeable battery. In [13], [14], Sharma et
al. studied heuristic delay-minimizing policies and sufficient
stability conditions for a single EHD with a data queue. Ozel et
al. set up the offline throughput optimization problem from an
information theoretic point of view in GD, where they derived
the information-theoretic capacity of the AWGN channel and
presented two schemes that achieve such capacity
(save-and-transmit and best-effort-transmit). In fT^ , the authors
also modeled a battery-less system by a channel with state
dependent amplitude constraints and causal information at the
transmitter, and derived the capacity of this channel by making
use of a result by Shannon. The throughput optimization
problem with finite batteries in an EH system was studied
in ini, urg.

Security aspects have been widely studied in the WSN
literature 0, 0, 0. Examples of relevant applications in a
WSN/IoT context include health-care monitoring [20], [21],
where the sensitive data of patients may be exposed to a
malicious party, or military use [22], [23], where a WSN can
be used for monitoring or tracking enemy forces. In particular,
in addition to higher layers [24], that are relatively insensitive
to the physical characteristics of the wireless medium, the
physical layer can be used to strengthen the security of digital
communication systems and improve already existing security
measures. The basic idea behind the concept of physical layer
secrecy is to exploit the randomness of the communication
channel to limit the information that can be gathered by
the eavesdropper at the signal level. Through channel coding
techniques, it is possible to simultaneously allow the legitimate
receiver to correctly decode a packet and prevent a
potential third party malicious eavesdropper from decoding
it and thus provide information-theoretic or unconditional
security. Differently from computational security methods,
that are based on the limited computational capabilities of
the adversary (as in a cryptographic system), unconditional
security is considered the strongest notion of security [25]
because no limits on the adversary’s computing power are
assumed. Perfect secrecy 0 is achieved when there is zero
mutual information between the information signal, s, and the
signal received by the eavesdropper, z, i.e., $I(s; z) = 0$ and z
is useless when trying to determine s. In p^ , Wyner showed
that if the eavesdropper’s channel is degraded with respect to
the legitimate channel, then it is possible to exchange secure
information at a non-zero rate while keeping the information
leakage to the eavesdropper at a vanishing rate. This result
was extended in [27] for non-degraded channels provided the
eavesdropper channel is not less noisy than the legitimate
channel. In [28], the secrecy capacity of fading channels in
the presence of multiple eavesdroppers is studied. It was
shown in [29] that in a fading scenario it is also possible
to obtain a non-zero secure rate even if, on average, the
eavesdropper’s channel is better than the legitimate one. The
authors also established the importance of variable rate coding
(i.e., matching the code rate to the channel rate) in enabling
secure communications. In [30], the authors compute the
secrecy capacity of a MIMO wiretap channel with one receiver
and one eavesdropper and an arbitrary number of antennas.
A survey of physical layer security in modern networks is
presented in

The secrecy capacity paradigm in an energy harvesting
communication system was studied in [32], [33], where the authors
considered the case of a batteryless transmitter and found
the rate-equivocation region. [34] studied the deployment of
an energy harvesting cooperative jammer to increase physical
layer security. In 0 the authors presented a resource allocation
algorithm for a multiple-input single-output secrecy system
for a communication system based on RF energy harvesting.
Also, [35] studied how to efficiently allocate power over several
sub-carriers in an EH system with secrecy constraints. In [36]
the authors employed a physical layer secrecy approach in a
system with a transmitter that sends confidential messages to
a receiver and transfers wireless energy to energy harvesting
receivers. Our focus is substantially different from those: in the
present paper we consider an EHD that harvests energy from
an external, non-controllable and renewable energy source.
Our goal is to maximize the achievable secrecy rate, i.e., to
define how to correctly exploit the available (random) energy
according to the device battery dynamics.

Figure 1: Block diagram of the system. g and h are the channel gains and p
represents the power allocated over the N sub-carriers.

Our main contribution lies in the definition of a new
practical and challenging problem. As in [32], [33], we investigate
the physical layer secrecy in an EH system. However,
differently from those papers, we explicitly consider the effects
of a finite battery and we focus on finding the transmission
strategy that maximizes the secrecy rate, namely the Optimal
Secrecy Policy (OSP). Since in a WSN the devices operate
under the same conditions for long periods, the steady-state
regime is generally reached, and thus we focus on the long-term
optimization. Similarly to in), pg, we set up an
optimization problem based on a Markov Decision Process
(MDP) approach but, unlike in those works, we focus on
the security aspects, considering the presence of a malicious
eavesdropper and a generic number of sub-carriers. Thus,
even if the proposed analytical framework is similar to those
provided in the literature, since additional dimensions are
considered, the optimization process is more challenging and
different considerations and insights are derived. In particular,
we prove several properties of OSP and describe a technique
to compute it by decomposing the problem into two steps.
We specify how to allocate the power over the different
sub-carriers and remark that a smart power splitting scheme
is important to achieve high secrecy rates. As in [29], we
consider several degrees of knowledge of the channel state
information, describing both variable and constant rate coding
techniques and discussing how the achievable secrecy rate
changes in these cases. However, unlike [29], we study an
energy constrained system with N parallel sub-carriers, and
accordingly formulate and solve an optimization problem to
determine the maximum secrecy rate. Therefore, our paper
considers aspects that either have not yet been considered or
have been separately studied in the literature, and represents
an advancement of the state of the art in the important areas of
green networking and security, leading to novel insights about
the interaction of many different system design aspects.

The paper is organized as follows. Section II defines the
system model we analyze and introduces the notion of secrecy
rate. In Section III we introduce the secrecy rate optimization
problem. Section IV describes how to find OSP and some of
its properties with full CSI. In Section V we study the case of
imperfect CSI knowledge. Section VI presents our numerical
results. Finally, Section VII concludes the paper.

















II. System Model and Secrecy Rate 

We consider an Energy Harvesting Device (EHD) that
simultaneously transmits data in a wide frequency band composed
of N narrow bands. The transmission power can be
different for every sub-carrier. The transmission model can be
described as a set of N parallel Gaussian wiretap channels,
affected by independent fading, as in p8) . The goal of the
transmitter is to send data to the legitimate receiver with a
positive secrecy rate in order to guarantee secure transmission.
An eavesdropper attempts to intercept the transmitted data (see
Figure 1 for the block diagram of the system model).

We initially assume that the EHD knows the Channel
State Information (CSI) of all the sub-carriers toward the
receiver and the eavesdropper instantaneously, and will relax
this hypothesis in Section V. Time is divided into slots of
equal duration T, chosen according to the channel coherence
time, in order to guarantee constant channel gains in every
slot. The EHD is equipped with a battery of finite size $e_{\max}$
and in slot k the device has $E^{(k)} \in \mathcal{E} = \{0, \ldots, e_{\max}\}$ energy
quanta stored.¹ Knowledge of the state of charge is useful
at the transmitter side only to determine when to schedule
a transmission. The harvesting process is described through
an energy quanta arrival process $\{B^{(k)}\}$, e.g., deterministic,
Bernoulli or truncated geometric (for example, see p9) for a
characterization of the light energy). The average harvesting
rate is $\bar{b}$, the maximum (minimum) number of energy quanta
harvested per slot is $b_{\max}$ ($b_{\min}$), and a quantum harvested in
slot k can only be used in time slots > k. We assume that
the device always has data to send and that the energy cost
that the device sustains is mainly due to data transmission.
Extensions to more general models are left for future work.

The channel gains in slot k are $\mathbf{g}^{(k)} = [g_1^{(k)}, \ldots, g_N^{(k)}]$ and
$\mathbf{h}^{(k)} = [h_1^{(k)}, \ldots, h_N^{(k)}]$ for the N legitimate and eavesdropper
sub-carriers, respectively, and can be interpreted as
realizations of two jointly distributed random vectors $\mathbf{G} = [G_1, \ldots, G_N]$
and $\mathbf{H} = [H_1, \ldots, H_N]$ (i.i.d. over time) with supports $\mathcal{G}$
and $\mathcal{H}$. We assume that the receiver has complete CSI of its
channel in order to decode the received signal. Instead, the
eavesdropper has knowledge of every aspect of the system
(this is a reasonable worst-case assumption, as the transmission
strategy should not rely on assuming the eavesdropper’s
ignorance of any state). Nevertheless, we should point out that,
for a passive eavesdropper, knowledge of the main channel
state is totally immaterial. In the following, when we refer to
“full” or “partial” CSI, we always refer to the transmitter side.

A. Secrecy Rates and Capacity 

We refer to the notions of secrecy rate and secrecy capacity
as known in the physical layer secrecy literature g, m
and their ergodic counterparts in the fading scenario [40].
Specifically, we define an $(M, N, \ell)$ code for the parallel
wiretap channel as consisting of: 1) a message set $\mathcal{S}$ with
cardinality $M$, 2) a probabilistic encoder at the transmitter
that maps each message $s \in \mathcal{S}$ (realization of the r.v. $S$) to
an $N \times \ell$ codeword $x \in \mathcal{X}^{\ell}$, with $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_N$,
according to some conditional distribution $p_{X|S}(x|s)$, and 3) a
(deterministic) decoder at the legitimate receiver that extracts
$\hat{s}$ (realization of the r.v. $\hat{S}$) from the received message $y \in \mathcal{Y}^{\ell}$,
where $\mathcal{Y} = \mathcal{Y}_1 \times \cdots \times \mathcal{Y}_N$, i.e., $\mathcal{Y}^{\ell} \to \mathcal{S}$.

¹While in reality energy is a continuous quantity, we decide to adopt an
approximate approach and discretize it in order to simplify the numerical
optimization and apply the discrete MDP theory. However, we remark that it
is always possible to use a finer quantization in order to improve the accuracy
of the discrete approximation (which however implies higher complexity).

The average error probability of an $(M, N, \ell)$ code is given
by

$$P_{\mathrm{err}} = \frac{1}{M} \sum_{s \in \mathcal{S}} \mathbb{P}(\hat{S} \neq s \mid S = s). \tag{1}$$

The equivocation rate at the eavesdropper is $R_e^{(\ell)} =
(1/\ell) H(S \mid Z^{\ell})$, i.e., the conditional entropy rate of the
transmitted message given the eavesdropper’s channel output $Z^{\ell}$.
$R_e^{(\ell)}$ represents the level of ignorance on the target secret
message at the eavesdropper. Perfect secrecy (unconditional
security) would be obtained if $R_e^{(\ell)} = R_s$, where $R_s =
(1/\ell) H(S)$ is the secret message rate. However, this is not
possible in general with wiretap coding techniques, so we must
settle for a weaker requirement, that holds asymptotically.
Therefore, a secrecy rate $R_s$ is said to be achievable if there
exists a sequence of $(2^{\ell R_s}, N, \ell)$ codes, $\ell = 1, 2, \ldots$, such
that

$$\lim_{\ell \to \infty} P_{\mathrm{err}} = 0, \qquad R_s \le \lim_{\ell \to \infty} R_e^{(\ell)} \tag{2}$$

and the secrecy capacity is defined as the supremum of the set
of achievable secrecy rates.

B. Coding Strategy 

The transmitter coding strategy influences the secrecy rate.
In particular, in this paper we consider constant and variable
rate coding defined as follows (a construction procedure for
these codes can be derived as explained in [29, Theorems 1
and 2]).

Variable rate coding consists in adapting the code rate
to the main channel state. This can be accomplished by
constructing a separate codeword x for every realization of
the channel, i.e., x = x(current channel). In this case, in
every slot k and on every sub-carrier $r = 1, \ldots, N$ the
transmitter observes the channel and picks the symbols to be
transmitted from the current codeword $x(g_r^{(k)})$. We study the
long-term regime and thus we consider the case of infinite
length codewords. With variable rate coding, when the gain
of the legitimate channel in a given sub-carrier is $g$, the
transmitter uses symbols from codewords at rate $\log(1 + g p)$
(where $p$ is the transmission power, which will be the objective
of our optimization). To achieve such a rate, it is required
to use a codeword specifically designed for this channel, i.e.,
$x(g)$. Then, if the eavesdropper’s channel gain is $h > g$, thanks
to the chosen coding rate, the mutual information between
the transmitter and the eavesdropper is upper-bounded by
$\log(1 + g p)$. Instead, when $h < g$, the mutual information
becomes $\log(1 + h p)$ (Shannon’s theorem). We can summarize
the two previous cases as $\log(1 + \min\{g, h\} p)$. Therefore, even
if $h > g$, the eavesdropper does not receive more information
than the legitimate receiver (they both experience the same
rate $\log(1 + g p)$). In the long run, the average rate of the main
channel and the information accumulated at the eavesdropper
are

$$\liminf_{K \to \infty} \frac{1}{K+1} \sum_{k=0}^{K} \sum_{r=1}^{N} \log\left(1 + g_r^{(k)} p\right) \tag{3}$$

and

$$\liminf_{K \to \infty} \frac{1}{K+1} \sum_{k=0}^{K} \sum_{r=1}^{N} \log\left(1 + \min\{g_r^{(k)}, h_r^{(k)}\}\, p\right), \tag{4}$$

respectively. In this case, by constructing a code and the
corresponding coding map, the long-term secrecy rate (amount
of secret information that can be sent) is

$$\liminf_{K \to \infty} \frac{1}{K+1} \sum_{k=0}^{K} \sum_{r=1}^{N} \left( \log\left(1 + g_r^{(k)} p\right) - \log\left(1 + \min\{g_r^{(k)}, h_r^{(k)}\}\, p\right) \right). \tag{5}$$


Constant rate coding consists in keeping the code rate constant,
regardless of the legitimate and eavesdropper’s channel
states. In this case, a single codeword x is used in every fading
condition. In every slot, the transmitter picks the symbols to
be transmitted from the only available codeword x. In the long
run, since we consider infinite length codewords, x spans the
entire fading statistic of the channel. With constant rate coding,
regardless of the current channel state, the transmitter uses
codewords at a fixed rate $R_{\mathrm{con}}$ such that $R_{\mathrm{con}} > \log(1 + g p)$
for every $g$ and $p$. In this case, if the current legitimate channel
is $g$, the mutual information between transmitter and receiver
is upper bounded by Shannon’s theorem as $\log(1 + g p)$.
Similarly, the mutual information between transmitter and
eavesdropper is given by $\log(1 + h p)$. The secrecy rate can be
expressed as

$$\left[ \liminf_{K \to \infty} \frac{1}{K+1} \sum_{k=0}^{K} \sum_{r=1}^{N} \left( \log\left(1 + g_r^{(k)} p\right) - \log\left(1 + h_r^{(k)} p\right) \right) \right]^{+} \tag{6}$$

where $[\cdot]^{+} = \max\{0, \cdot\}$ is used to obtain a non-negative
rate. Note that (6) is lower than (or equal to) (5), i.e., higher
secrecy is achieved with variable rate coding. However, its
implementation is more difficult as the code rate has to be
changed frequently according to the legitimate channel state.


For simplicity, in the following we use $R_{g,h}(p)$ to indicate the
terms of the sum in (5) if variable rate coding is considered,
or in (6) in the constant rate coding case, i.e.,

$$R_{g,h}(p) \triangleq \begin{cases} \log(1 + g p) - \log(1 + \min\{g, h\}\, p), & \text{var. rate}, \\ \log(1 + g p) - \log(1 + h p), & \text{con. rate}. \end{cases} \tag{7}$$

$c(\mathbf{p}, \mathbf{g}, \mathbf{h})$ is the generalization with a generic number of
sub-carriers N:

$$c(\mathbf{p}, \mathbf{g}, \mathbf{h}) = \sum_{r=1}^{N} R_{g_r, h_r}(p_r), \tag{8}$$

and $p^{\mathrm{tot}}$ is the corresponding total transmission power, defined
as

$$p^{\mathrm{tot}} = \mathbf{1}_N^{T} \mathbf{p}. \tag{9}$$

The value of $c(\mathbf{p}, \mathbf{g}, \mathbf{h})$ depends on the choice of the power
allocation over the several sub-carriers, $\mathbf{p} = [p_1, \ldots, p_N]^{T}$, the
channel conditions $\mathbf{g}$ and $\mathbf{h}$, and the coding rate strategy. $\mathbf{1}_N$
is a column vector consisting of N ones. In the general case,
the choice of $\mathbf{p}$ that maximizes the secrecy rate, among those
satisfying (9), will in turn depend upon the channel conditions
$\mathbf{g}$ and $\mathbf{h}$.
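As an illustration of (7)-(9), the following minimal Python sketch evaluates the per-sub-carrier term and the total contribution for both coding strategies. The function names, the use of the natural logarithm and the numerical gains are assumptions made for this example only; they are not taken from the paper.

```python
import math


def secrecy_term(p, g, h, variable_rate=True):
    """Per-sub-carrier term R_{g,h}(p) of Eq. (7).

    The natural logarithm is an assumption; another base only rescales the rate.
    """
    main = math.log(1.0 + g * p)
    if variable_rate:
        # Variable rate coding caps the eavesdropper's information at the main rate.
        eve = math.log(1.0 + min(g, h) * p)
    else:
        eve = math.log(1.0 + h * p)
    return main - eve


def secrecy_contribution(p_vec, g_vec, h_vec, variable_rate=True):
    """c(p, g, h) of Eq. (8): sum of the per-sub-carrier terms."""
    return sum(secrecy_term(p, g, h, variable_rate)
               for p, g, h in zip(p_vec, g_vec, h_vec))


def total_power(p_vec):
    """p^tot of Eq. (9), i.e., the sum of the entries of p."""
    return sum(p_vec)


# Hypothetical gains: sub-carrier 1 favours the receiver, sub-carrier 2 the eavesdropper.
g = [2.0, 0.5]
h = [1.0, 1.5]
uniform = [1.0, 1.0]   # same power on both sub-carriers
focused = [2.0, 0.0]   # all power on the sub-carrier with g > h

c_uniform_var = secrecy_contribution(uniform, g, h, variable_rate=True)
c_uniform_con = secrecy_contribution(uniform, g, h, variable_rate=False)
c_focused_var = secrecy_contribution(focused, g, h, variable_rate=True)
c_focused_con = secrecy_contribution(focused, g, h, variable_rate=False)
```

With these numbers, the two allocations use the same total power, but placing no power on the sub-carrier where the eavesdropper is stronger gives a larger contribution and makes the two coding strategies coincide.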


III. Optimization Problem 

The system state in time slot k is defined by the $(2N+1)$-tuple
$(E^{(k)}, \mathbf{g}^{(k)}, \mathbf{h}^{(k)})$. A policy $\mu$ is a set of rules that,
given the state of the system, specifies the power allocation
over the N sub-carriers.

In the long run, the average secrecy rate under a policy $\mu$
is given by the average undiscounted reward

$$C_{\mu}(E^{(0)}) = \liminf_{K \to \infty} \frac{1}{K+1} \sum_{k=0}^{K} c\left(\boldsymbol{\Sigma}^{(k)}, \mathbf{g}^{(k)}, \mathbf{h}^{(k)}\right), \tag{10}$$

where $c(\cdot, \cdot, \cdot)$ is the instantaneous partial contribution defined
in (8), $\boldsymbol{\Sigma}^{(k)}$ is the power allocation vector defined by the
policy² and $E^{(0)}$ is the energy in the initial time slot. A secure
communication can be performed if $C_{\mu}(E^{(0)}) > 0$. (10) is a
generalization of (5) and (6) for N sub-carriers and a dynamic
transmission power.

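The battery evolution is as follows

$$E^{(k+1)} = \min\left\{ e_{\max},\; E^{(k)} - \sum_{r=1}^{N} \Sigma_r^{(k)} + B^{(k)} \right\}, \tag{11}$$

where $\Sigma_r^{(k)}$ is the $r$-th component of the vector $\boldsymbol{\Sigma}^{(k)}$, $B^{(k)}$ is the
number of energy quanta harvested in slot k, and the
min is used to account for the finite battery. Note that $\boldsymbol{\Sigma}^{(k)}$
must satisfy $\sum_{r=1}^{N} \Sigma_r^{(k)} \le E^{(k)}$, $\forall k$ and $\Sigma_r^{(k)} \ge 0$, $\forall k$, $\forall r$.

Thus, Problem (10) is implicitly influenced by the evolution
of $E^{(k)}$ because of (11).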
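Our aim is to solve the following maximization problem

$$\mu^{\star} = \arg\max_{\mu} C_{\mu}(E^{(0)}). \tag{12}$$

A policy that solves (12) is an Optimal Secrecy Policy
(OSP). In the next subsection we explain in more detail
the optimization variables and the constraints of the above
problem.

²Given a temporal sequence of energy arrivals and channel states, the policy
$\mu$ can be applied to obtain the power allocation vector $\boldsymbol{\Sigma}^{(k)}$. In this case we
use a deterministic policy for presentation simplicity, and prove later that this
choice is optimal.

A. Markov Decision Process Formulation

Since we consider a long-term optimization, we recast the
problem using a Markov Decision Process (MDP) formulation.
In particular, we model our system by a Markov Chain (MC)
with a finite number of states. For every MC state $(e, \mathbf{g}, \mathbf{h})$, a
power allocation policy $\mu$ is the set of rules

$$\mu = \left\{ \mu(\cdot\,; e, \mathbf{g}, \mathbf{h}),\; \forall e \in \mathcal{E},\; \forall \mathbf{g} \in \mathcal{G},\; \forall \mathbf{h} \in \mathcal{H} \right\}, \tag{13}$$

where $\mu(\cdot\,; e, \mathbf{g}, \mathbf{h})$ is the conditional distribution (pmf) of the
power allocation vector defined as follows

$$\mu(\mathbf{p}; e, \mathbf{g}, \mathbf{h}) \triangleq \mathbb{P}\left(\boldsymbol{\Sigma} = \mathbf{p} \mid E = e, \mathbf{G} = \mathbf{g}, \mathbf{H} = \mathbf{h}\right) \tag{14}$$

and, for every $\mathbf{g}$, $\mathbf{h}$, is subject to

$$\sum_{\mathbf{p} \in \mathcal{P}_{\le}(e)} \mu(\mathbf{p}; e, \mathbf{g}, \mathbf{h}) = 1, \tag{15a}$$

$$\mu(\mathbf{p}; e, \mathbf{g}, \mathbf{h}) \ge 0, \quad \forall \mathbf{p} \in \mathcal{P}_{\le}(e), \tag{15b}$$

$$\mathcal{P}_{\le}(e) \triangleq \left\{ \mathbf{p} : \mathbf{p} \succeq 0,\; p^{\mathrm{tot}} = \mathbf{1}_N^{T} \mathbf{p} \le e \right\}. \tag{15c}$$

$\mathcal{P}_{\le}(e)$ is the set of all feasible vectors $\mathbf{p}$ when the energy level
is $e$. The reward function becomes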

$$C_{\mu}(E^{(0)}) = \sum_{e \in \mathcal{E}} \pi_{\mu}(e \mid E^{(0)}) \times \int \underbrace{\sum_{\mathbf{p} \in \mathcal{P}_{\le}(e)} c(\mathbf{p}, \mathbf{g}, \mathbf{h})\, \mu(\mathbf{p}; e, \mathbf{g}, \mathbf{h})}_{\text{secrecy rate given the MC state } (e, \mathbf{g}, \mathbf{h})} \, \mathrm{d}F(\mathbf{g}, \mathbf{h}), \tag{16}$$

where $\pi_{\mu}(e \mid E^{(0)}) \in [0, 1]$ is the steady-state probability of
having $e$ energy quanta stored starting from state $E^{(0)}$ under
a policy $\mu$ and $F(\mathbf{g}, \mathbf{h})$ is the joint cumulative distribution
function of $\mathbf{G}$ and $\mathbf{H}$. $\pi_{\mu}(e \mid E^{(0)})$ summarizes the battery
evolution and is evaluated according to (11). The optimization
variables in Problem (12) are the pmfs $\mu(\cdot\,; e, \mathbf{g}, \mathbf{h})$. Also, it
can be shown (see Section IV-A) that an OSP which admits a
steady-state distribution always exists. Therefore, without loss
of optimality, we decided to restrict our study to the class
of policies with steady-state distribution. For these policies,
since we focus on the average long-term optimization, (16) is
equivalent to (10).

It is possible to separate $\mu$ into the product of a transmit
power policy, which specifies the conditional distribution of
the total transmission power given the current state, namely
$\gamma_{\mu}(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h})$, and the conditional distribution of the power
allocation given the total transmission power and the current
state, namely $\phi_{\mu}(\mathbf{p}; p^{\mathrm{tot}}, e, \mathbf{g}, \mathbf{h})$:

$$\mu(\mathbf{p}; e, \mathbf{g}, \mathbf{h}) = \phi_{\mu}(\mathbf{p}; p^{\mathrm{tot}}, e, \mathbf{g}, \mathbf{h})\, \gamma_{\mu}(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}). \tag{17}$$

The above expression will be useful to decompose the
problem into two steps in Theorem 2.

We highlight that $\mu$ performs a power control mechanism,
i.e., it specifies how much power is used in every MC state
but, in addition to power control, also the code rate can be
changed according to Section II-B.


B. Finite Model 

In the previous subsection, we assumed that the policy can
be defined for every possible value of the channel gains. This
can be done by simple enumeration if $|\mathcal{G}| < \infty$ and $|\mathcal{H}| < \infty$.
However, the channel gains may be continuous variables in the
general case. Instead of defining a policy for a continuously
infinite set of values, we want to find a set of points where the
policy can be computed and optimized efficiently. The following
approach can be followed. Consider the random variable
$G_1$ (for the others the reasoning is similar). We discretize the
support of $G_1$ in $n$ intervals with an equally likely strategy
($\mathbb{P}(G_1 \in [g_i, g_{i+1})) = 1/n$ for every interval $i$). Then, we specify
the policy in the centroid of every interval. If the number of
intervals $n$ is sufficiently large, the approximation is very close
to the continuous case.
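One possible way to obtain such points is sketched below in Python. It is a sample-based approximation rather than an exact quantile computation, and it rests on several assumptions that are not stated in the text: the power gain of a Nakagami-m channel is drawn as a Gamma variable with shape m and scale mean/m, the intervals are formed by sorting the samples into n groups of equal size, and the "centroid" is taken as the mean of each group. The function name and the parameter values are hypothetical.

```python
import random


def equally_likely_centroids(samples, n):
    """Split sorted samples into n equally sized groups and return each group's mean."""
    ordered = sorted(samples)
    size = len(ordered) // n
    if size == 0:
        raise ValueError("need at least n samples")
    # Drop any remainder so that every group has exactly the same empirical probability.
    ordered = ordered[:size * n]
    return [sum(ordered[i * size:(i + 1) * size]) / size for i in range(n)]


rng = random.Random(1)
m, mean_gain = 2.0, 1.0        # hypothetical Nakagami-m parameter and mean power gain
# Power gain of a Nakagami-m channel: Gamma(shape=m, scale=mean_gain/m).
samples = [rng.gammavariate(m, mean_gain / m) for _ in range(20000)]
centroids = equally_likely_centroids(samples, 8)
```

Each centroid then stands for one discrete channel state with probability 1/n, and increasing n (and the number of samples) refines the approximation.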

Remark 1. Since we consider a discrete channel, we focus
without loss of generality on channel conditions with non-zero
probability, i.e., $\mathbb{P}(\mathbf{G} = \mathbf{g}, \mathbf{H} = \mathbf{h}) > 0$, $\forall \mathbf{g} \in \mathcal{G}$, $\mathbf{h} \in \mathcal{H}$.


IV. Optimal Secrecy Policy with Complete CSI 


In this section we study the case when the transmitter has
perfect CSI knowledge, and introduce a technique to compute
OSP and some of its properties. All our results are useful to
simplify the numerical evaluation. In particular: 1) we prove
that there exists a deterministic OSP (Theorem 1); 2) we
propose a technique to derive a unichain OSP (Section IV-A);
3) we decompose the optimization process in two steps
(Theorem 2); and 4) we show that the transmission power increases
(decreases) with the channel gain of the legitimate receiver’s
(eavesdropper’s) sub-carriers (Theorem 3).

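Theorem 1. There exists a deterministic OSP, i.e., an optimal
secrecy policy in which, for every MC state $(e, \mathbf{g}, \mathbf{h})$

$$\mu^{\star}(\mathbf{p}; e, \mathbf{g}, \mathbf{h}) = \begin{cases} 1, & \text{if } \mathbf{p} = \mathbf{p}^{\star}_{e,\mathbf{g},\mathbf{h}}, \\ 0, & \text{otherwise}, \end{cases} \tag{18}$$

for some $\mathbf{p}^{\star}_{e,\mathbf{g},\mathbf{h}}$ depending upon the current MC state in
general.

Proof: See the Appendix. ■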

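By exploiting Equation (17), it also follows that $\exists\, p^{\mathrm{tot}\star}_{e,\mathbf{g},\mathbf{h}}$ such
that the transmit power policy $\gamma_{\mu}$ defined in (17) satisfies

$$\gamma_{\mu^{\star}}(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) = \begin{cases} 1, & \text{if } p^{\mathrm{tot}} = p^{\mathrm{tot}\star}_{e,\mathbf{g},\mathbf{h}}, \\ 0, & \text{otherwise}. \end{cases} \tag{19}$$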

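Definition 1 (Deterministic Policy). Since a deterministic OSP
always exists, we only need to study deterministic policies, thus
$\mu$ can be redefined as

$$\mu = \left\{ \mathbf{p}_{e,\mathbf{g},\mathbf{h}} \in \mathcal{P}_{\le}(e),\; \forall e \in \mathcal{E},\; \forall \mathbf{g} \in \mathcal{G},\; \forall \mathbf{h} \in \mathcal{H} \right\}. \tag{20}$$

$\mathbf{p}_{e,\mathbf{g},\mathbf{h}} = [p_{1;e,\mathbf{g},\mathbf{h}}, \ldots, p_{N;e,\mathbf{g},\mathbf{h}}]$ characterizes the transmission
powers on different sub-carriers in state $(e, \mathbf{g}, \mathbf{h})$.

We also introduce the sub-policy $\mu^{\mathrm{tot}}$ as

$$\mu^{\mathrm{tot}} = \left\{ p^{\mathrm{tot}}_{e,\mathbf{g},\mathbf{h}},\; \forall e \in \mathcal{E},\; \forall \mathbf{g} \in \mathcal{G},\; \forall \mathbf{h} \in \mathcal{H} \right\}, \tag{21}$$

which accounts for the total transmission powers only. $\mu^{\mathrm{tot}}$
and $\mu$ are consistent if the sum of the elements of $\mathbf{p}_{e,\mathbf{g},\mathbf{h}}$ in $\mu$
is equal to $p^{\mathrm{tot}}_{e,\mathbf{g},\mathbf{h}}$ in $\mu^{\mathrm{tot}}$, $\forall e \in \mathcal{E}$, $\mathbf{g} \in \mathcal{G}$, $\mathbf{h} \in \mathcal{H}$.

The deterministic property is particularly useful to simplify
the numerical evaluation because a policy needs to define
only a single power allocation for every state of the system
and not a probability distribution.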










A. Unichain Policies 

We restrict our study to the class of unichain policies, i.e.,
those that induce a unichain MC (i.e., a MC with a single
recurrent class). This is useful in order to apply the standard
optimization algorithms in the next section.

Some sufficient conditions to obtain a unichain policy are
presented in the following proposition (in this subsection we
use deterministic policies for presentation simplicity, but the
results can be easily extended).

Proposition 1. If a policy satisfies one of the following
conditions, then it is unichain. If it satisfies both conditions,
the policy induces an irreducible, positive recurrent MC.

1) For every $e \in \mathcal{E} \setminus \{e_{\max}\}$ there exists a pair $(\mathbf{g}, \mathbf{h})$
such that $p^{\mathrm{tot}}_{e,\mathbf{g},\mathbf{h}} < b_{\max}$ (maximum number of energy
arrivals).

2) For every $e \in \mathcal{E} \setminus \{0\}$ there exists a pair $(\mathbf{g}, \mathbf{h})$ such
that $p^{\mathrm{tot}}_{e,\mathbf{g},\mathbf{h}} > b_{\min}$.

Proof: See the Appendix. ■

In practice, the first and second points ensure that there
is a positive probability that the battery moves from level $e$
to higher and lower energy levels, respectively. When they
are both verified, no transient state can exist, and the MC is
irreducible.

When at least one point of Proposition 1 is satisfied, the
corresponding policy is guaranteed to be unichain. However,
in general, these conditions may not be satisfied and a policy
may not be unichain. In addition, there may exist more than
one policy with the same maximum achievable secrecy rate
(the highest secrecy rate among $C_{\mu}(0), \ldots, C_{\mu}(e_{\max})$). Some
of these are unichain, whereas others are not. Consider the
following example to justify these claims.

Example 1. We want to show a case in which 1) multiple
policies with the same maximum reward exist and 2) some of
them are not unichain.

Assume that the harvesting process is deterministic and
equal to $b_{\max} < e_{\max}/3$, $N = 1$, and the channel is constant
with $g_1 > h_1$. Consider the following policies

$$\mu_1 = \left\{ p_{1;e,g_1,h_1} = \min\{e, b_{\max}\},\; \forall e,\; \forall g_1, h_1 \right\},$$

$$\mu_2 = \begin{cases} p_{1;e,g_1,h_1} = b_{\max}, & \text{if } e = b_{\max}, \\ p_{1;e,g_1,h_1} = 2 b_{\max}, & \text{if } e = e_{\max}, \\ p_{1;e,g_1,h_1} = 0, & \text{otherwise}. \end{cases}$$

$\mu_1$ is a unichain policy (the recurrent class is the battery
level $\{b_{\max}\}$) that provides a long-term secrecy rate
$c(b_{\max}, g_1, h_1)$. Instead, $\mu_2$ is not unichain (the two recurrent
classes are $\{b_{\max}\}$ and $\{e_{\max} - b_{\max}, e_{\max}\}$) and its long-term
secrecy rate depends upon the initial state (it can be
$c(b_{\max}, g_1, h_1)$ or $0.5\, c(2 b_{\max}, g_1, h_1)$). Also, note that, because
$c(\cdot, g_1, h_1)$ is concave, $c(b_{\max}, g_1, h_1) \ge
0.5\, c(2 b_{\max}, g_1, h_1)$. Therefore, there exists more than one
policy with the same maximum achievable reward $c(b_{\max}, g_1, h_1)$.
Moreover, in $\mu_2$ there are two recurrent classes, and thus it
is not unichain.
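The dependence on the initial state can be checked numerically. The Python sketch below simulates the two policies of the example as they are read here (the first spends min{e, b_max}; the second spends b_max at level b_max, 2 b_max at level e_max and nothing elsewhere). The battery update, the natural logarithm and the numerical values of b_max, e_max, g_1 and h_1 are assumptions chosen for illustration.

```python
import math


def c(p, g, h):
    """Secrecy contribution for N = 1 with g > h (both coding schemes coincide)."""
    return math.log(1.0 + g * p) - math.log(1.0 + h * p)


def policy_1(e, b_max, e_max):
    return min(e, b_max)


def policy_2(e, b_max, e_max):
    if e == b_max:
        return b_max
    if e == e_max:
        return 2 * b_max
    return 0


def average_rate(policy, e0, b_max, e_max, g, h, n_slots=1000):
    """Long-run average secrecy rate with deterministic harvesting of b_max quanta per slot."""
    e, total = e0, 0.0
    for _ in range(n_slots):
        p = policy(e, b_max, e_max)
        total += c(p, g, h)
        e = min(e_max, e - p + b_max)   # assumed battery update
    return total / n_slots


b_max, e_max, g1, h1 = 1, 10, 2.0, 1.0      # hypothetical values
r1 = average_rate(policy_1, b_max, b_max, e_max, g1, h1)
r2_low = average_rate(policy_2, b_max, b_max, e_max, g1, h1)    # start at level b_max
r2_high = average_rate(policy_2, e_max, b_max, e_max, g1, h1)   # start at level e_max
```

Starting from level b_max, both policies yield c(b_max, g_1, h_1). Starting the second policy from e_max makes the battery alternate between e_max and e_max - b_max, so the average is half of c(2 b_max, g_1, h_1), which is smaller.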

This example shows that the long-term secrecy rate for a 
non-unichain policy may depend upon the starting state. Also, 



it shows that in general there may exist different policies, 
unichain and not unichain, with the same maximum achievable 
secrecy rate. The following proposition establishes that there 
is no loss in generality in considering only unichain policies. 


Proposition 2. Given a generic policy, it is always possible
to derive another policy which is unichain and attains the
same maximum achievable secrecy rate as the original policy,
regardless of the initial state.

Proof: We provide a constructive proof in the Appendix. ■

In the rest of the paper we always refer to unichain policies,
for which $C_{\mu}(E^{(0)})$ is independent of $E^{(0)}$ [41]. In particular,
Proposition 2 holds for the optimal secrecy policies, i.e., there
always exists a unichain OSP, and therefore we will focus on
unichain policies with no loss in optimality. Note that, since
we consider a finite MC (we discretized both the battery level
and the channel gains), a unichain policy always implies the
existence of a steady-state distribution as in Equation (16).
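For a finite chain, the steady-state battery distribution can be computed numerically. The following Python sketch builds the battery transition matrix induced by a deterministic rule for the total transmit power and approximates its stationary distribution by repeated multiplication. It is a sketch under assumptions: the battery update is "level minus energy spent plus arrivals, capped at e_max", the channel takes finitely many states that are i.i.d. over time, and the chain is aperiodic so that the iteration converges. All names and numbers are hypothetical.

```python
def transition_matrix(p_tot, channel_probs, arrival_pmf, e_max):
    """Battery transition matrix for total powers p_tot[e][j] (battery e, channel state j)."""
    n = e_max + 1
    P = [[0.0] * n for _ in range(n)]
    for e in range(n):
        for j, q in enumerate(channel_probs):
            used = p_tot[e][j]
            if used < 0 or used > e:
                raise ValueError("infeasible total power")
            for b, prob_b in arrival_pmf.items():
                nxt = min(e_max, e - used + b)
                P[e][nxt] += q * prob_b
    return P


def stationary_distribution(P, n_iter=2000):
    """Approximate the stationary distribution by iterating pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


e_max = 4
channel_probs = [0.5, 0.5]            # hypothetical "good" and "bad" channel states
arrival_pmf = {0: 0.5, 1: 0.5}        # Bernoulli arrivals of one quantum
# Hypothetical rule: spend one quantum only in the good state, if available.
p_tot = [[min(e, 1), 0] for e in range(e_max + 1)]
P = transition_matrix(p_tot, channel_probs, arrival_pmf, e_max)
pi = stationary_distribution(P)
```

For this toy rule the battery can move both up and down from every interior level, and the resulting distribution is [1, 2, 2, 2, 2]/9, which can be verified by hand from the balance equations of the birth-death chain.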


B. Computation of OSP 

We now want to simplify the expression of $C_{\mu}$ by exploiting
the results we have found so far. If $\mu$ and $\mu^{\mathrm{tot}}$ are consistent,
the long-term secrecy function can be rewritten as

$$C_{\mu} = \sum_{e \in \mathcal{E}} \pi_{\mu^{\mathrm{tot}}}(e) \int c\big(\underbrace{\mathbf{p}_{e,\mathbf{g},\mathbf{h}}}_{\text{specified by } \mu}, \mathbf{g}, \mathbf{h}\big)\, \mathrm{d}F(\mathbf{g}, \mathbf{h}). \tag{22}$$

An interesting fact is that the steady-state probability
$\pi_{\mu^{\mathrm{tot}}}(e)$ depends upon the sub-policy $\mu^{\mathrm{tot}}$ only. This is
because $\pi_{\mu^{\mathrm{tot}}}(e)$ describes the battery energy evolution, that
depends only upon the total energy consumption in a slot, not
upon the particular power splitting scheme. This result leads
to the following theorem.

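Theorem 2. The maximization of $C_\mu$ can be decomposed into two steps:

1) fix a value $x$ and the channel gain vectors $\mathbf{g}$, $\mathbf{h}$ and find the optimal power splitting choice

$$\mathbf{p}^\star = \arg\max_{\mathbf{p}}\ c(\mathbf{p}, \mathbf{g}, \mathbf{h}), \tag{23a}$$

$$\text{s.t.: } \mathbf{p} \in \mathcal{P}_{=}(x) \triangleq \Big\{\mathbf{p} : \mathbf{p} \succeq 0,\ x = \sum_{r=1}^{N} p_r\Big\}; \tag{23b}$$

2) maximize $C_\mu$ by considering only $\mu^{\mathrm{tot}}$

$$\mu^{\mathrm{tot}\star} = \arg\max_{\mu^{\mathrm{tot}}}\ C_\mu, \tag{24a}$$

$$\text{s.t.: } \mu^{\mathrm{tot}} \text{ and } \mu \text{ are consistent}, \tag{24b}$$

$$\mathbf{p}_{e,\mathbf{g},\mathbf{h}} \text{ solves (23) with } x = p^{\mathrm{tot}}_{e,\mathbf{g},\mathbf{h}},\quad \forall e \in \mathcal{E},\ \forall \mathbf{g} \in \mathcal{G},\ \forall \mathbf{h} \in \mathcal{H}. \tag{24c}$$

The optimal $\mu$ can be found by fixing $\mu^{\mathrm{tot}}$ according to point 2) and choosing $\mathbf{p}$ with the optimal power splitting choice of point 1).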

Proof: See Appendix D. ■

The optimal power splitting choice $\mathbf{p}^\star$ that solves (23) can be found with a Lagrangian approach (for further details, see Theorem 1 and Equation (7) in [29]):

$$p_r^\star = \frac{1}{2}\left[\sqrt{a_r^2 + \frac{4 a_r}{\eta}} - \left(\frac{1}{h_r} + \frac{1}{g_r}\right)\right]^+, \tag{25}$$

$$a_r = \frac{1}{h_r} - \frac{1}{g_r}, \tag{26}$$

where $\eta$ is a parameter used to satisfy $x = \sum_{r=1}^{N} p_r^\star$. In the remainder of the paper we assume that this optimal power splitting choice is used, unless otherwise stated. We highlight that OSP yields $p_r^\star = 0$ if $g_r < h_r$, which implies that the achievable secrecy rate with complete CSI is independent of the coding scheme (the two secrecy rate expressions, for constant and variable rate coding, coincide).
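As an illustration of point 1) of Theorem 2, the following Python sketch splits a total power $x$ among $N$ sub-carriers. It is only a sketch under stated assumptions: the per-carrier expression is the one obtained from the stationarity condition of Problem (23) when $c(\mathbf{p},\mathbf{g},\mathbf{h}) = \sum_r [\log_2((1+g_r p_r)/(1+h_r p_r))]^+$, all gains are strictly positive, constant factors are absorbed into $\eta$, and the names `secrecy_rate` and `split_power` are hypothetical. The parameter $\eta$ is found by bisection, since the total allocated power decreases as $\eta$ grows.

```python
import math

def secrecy_rate(p, g, h):
    """Sum over sub-carriers of [log2((1 + g_r p_r) / (1 + h_r p_r))]^+."""
    return sum(max(0.0, math.log2((1 + gr * pr) / (1 + hr * pr)))
               for pr, gr, hr in zip(p, g, h))

def _allocation(eta, g, h):
    """Per-sub-carrier power for a given eta (zero whenever g_r <= h_r)."""
    powers = []
    for gr, hr in zip(g, h):
        if gr <= hr:
            powers.append(0.0)
            continue
        a = 1.0 / hr - 1.0 / gr
        root = math.sqrt(a * a + 4.0 * a / eta)
        powers.append(max(0.0, 0.5 * (root - (1.0 / hr + 1.0 / gr))))
    return powers

def split_power(x, g, h, iterations=200):
    """Split the total power x so that sum(p) equals x (bisection on eta)."""
    # Convention of this sketch: if no sub-carrier has g_r > h_r, any split
    # gives zero secrecy rate, so an all-zero vector is returned.
    if x <= 0 or all(gr <= hr for gr, hr in zip(g, h)):
        return [0.0] * len(g)
    lo = hi = 1.0
    while sum(_allocation(lo, g, h)) < x:   # smaller eta -> more power
        lo /= 2.0
    while sum(_allocation(hi, g, h)) > x:   # larger eta -> less power
        hi *= 2.0
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)            # geometric bisection
        if sum(_allocation(mid, g, h)) > x:
            lo = mid
        else:
            hi = mid
    return _allocation(math.sqrt(lo * hi), g, h)
```

For example, `split_power(3.0, [2.0, 0.5], [1.0, 1.0])` assigns all the power to the first sub-carrier, because on the second one the eavesdropper's channel is better.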

To solve Step 2) instead, the Optimal Secrecy Policy can be found numerically via dynamic programming techniques, e.g., using the Policy Iteration Algorithm (PIA). PIA alternates between a value determination phase, in which the current policy is evaluated, and a policy improvement phase, in which an attempt is made at improving the current policy. Policy improvement and evaluation can be performed in $O((e_{\max})^2 n^{2N})$ and $O((e_{\max})^3)$ arithmetic operations, respectively, where $O(\cdot)$ is the standard asymptotic notation. This result is derived as follows. For every state of the system ($e_{\max} \times n^N \times n^N$), the policy improvement step requires finding the best transmission power (which is $O(e_{\max})$) to reach every other battery level ($e_{\max}$). Instead, the $O((e_{\max})^3)$ performance of the policy evaluation step is due to a matrix inversion cost (which can be reduced to a sub-cubic exponent using Coppersmith-Winograd like algorithms). The previous two steps are performed iteratively until the optimal policy is found, which, in general, requires few iterations (< 10). Therefore, PIA has a polynomial complexity in the number of states of the system.
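The two phases of PIA can be sketched in Python as follows. This is a minimal illustration and not the implementation used for the numerical results: it assumes a single sub-carrier, integer energy levels, i.i.d. channel states given as a list of hypothetical triples `(g, h, prob)`, and the battery update $e' = \min\{e - p + b, e_{\max}\}$. Because the channel is i.i.d., the value determination phase is written over the battery levels only. The toy arrival distribution used in the usage note lets the battery be fully recharged in one slot, an assumption made only so that every policy is unichain and the linear system is non-singular.

```python
import math

def secrecy_rate(p, g, h):
    """Single sub-carrier secrecy rate [log2((1 + g p) / (1 + h p))]^+."""
    return max(0.0, math.log2((1 + g * p) / (1 + h * p)))

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting for A x = b."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            raise ValueError("singular system: the policy is not unichain")
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]

def policy_iteration(e_max, channels, arrivals, max_iter=100):
    """channels: list of (g, h, prob); arrivals: dict {b: prob}."""
    policy = {(e, s): 0 for e in range(e_max + 1) for s in range(len(channels))}

    def q_value(e, s, p, v):
        g, h, _ = channels[s]
        future = sum(pb * v[min(e - p + b, e_max)] for b, pb in arrivals.items())
        return secrecy_rate(p, g, h) + future

    rho = 0.0
    for _ in range(max_iter):
        # Value determination: v(e) + rho = r(e) + sum_e' P(e, e') v(e'),
        # with the normalization v(e_max) = 0.
        n = e_max + 1
        A = [[0.0] * n for _ in range(n)]   # unknowns: v(0..e_max-1), then rho
        rhs = [0.0] * n
        for e in range(n):
            A[e][e_max] = 1.0               # coefficient of rho
            if e < e_max:
                A[e][e] += 1.0
            for s, (g, h, ps) in enumerate(channels):
                p = policy[(e, s)]
                rhs[e] += ps * secrecy_rate(p, g, h)
                for b, pb in arrivals.items():
                    e_next = min(e - p + b, e_max)
                    if e_next < e_max:
                        A[e][e_next] -= ps * pb
        sol = solve(A, rhs)
        v, rho = sol[:e_max] + [0.0], sol[e_max]
        # Policy improvement: change an action only if it is strictly better.
        changed = False
        for (e, s), p_old in list(policy.items()):
            best = max(range(e + 1), key=lambda p: q_value(e, s, p, v))
            if q_value(e, s, best, v) > q_value(e, s, p_old, v) + 1e-12:
                policy[(e, s)] = best
                changed = True
        if not changed:
            break
    return policy, rho
```

A possible call is `policy_iteration(4, [(2.0, 1.0, 0.5), (0.5, 1.0, 0.5)], {0: 0.5, 1: 0.3, 4: 0.2})`, which returns the transmission power for every pair (battery level, channel state index) together with the long-term average secrecy rate of the last evaluated policy.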

Note that Theorem 2 with (25)-(26) decomposes the optimization into two steps. Therefore, the numerical evaluation only requires studying the two points separately instead of performing a (more computationally intensive) bi-dimensional optimization.

We also remark the following.

Lemma 1. By restricting the study to the unichain policies constructed as in Appendix C, OSP is uniquely determined.

Proof: In all the transient states, by construction (Appendix C), we have $p^{\mathrm{tot}}_{e,\mathbf{g},\mathbf{h}} = 0$. In the recurrent states, thanks to [42, Vol. II, Sec. 4], we know that $\mu$ is uniquely determined. ■


C. Properties 

We now derive a property that is useful to understand when 
the transmission power increases or decreases. 

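Proposition 3. Consider two channel states $\mathbf{g}'$, $\mathbf{h}'$ and $\mathbf{g}''$, $\mathbf{h}''$ and define

$$D(p^{\mathrm{tot}}; \mathbf{g}', \mathbf{h}'; \mathbf{g}'', \mathbf{h}'') \triangleq \frac{\partial}{\partial p^{\mathrm{tot}}}\Big( c(\mathbf{p}^\star_{p^{\mathrm{tot}},\mathbf{g}'',\mathbf{h}''}, \mathbf{g}'', \mathbf{h}'') - c(\mathbf{p}^\star_{p^{\mathrm{tot}},\mathbf{g}',\mathbf{h}'}, \mathbf{g}', \mathbf{h}') \Big), \tag{27}$$

where $\mathbf{p}^\star_{p^{\mathrm{tot}},\mathbf{g}',\mathbf{h}'}$ and $\mathbf{p}^\star_{p^{\mathrm{tot}},\mathbf{g}'',\mathbf{h}''}$ are defined as the solutions of Problem (23) with $x = p^{\mathrm{tot}}$. OSP has the following trend:

if $D(p^{\mathrm{tot}}; \mathbf{g}', \mathbf{h}'; \mathbf{g}'', \mathbf{h}'') > 0$, then $p^{\mathrm{tot}\star}_{e,\mathbf{g}'',\mathbf{h}''} \geq p^{\mathrm{tot}\star}_{e,\mathbf{g}',\mathbf{h}'}$;

if $D(p^{\mathrm{tot}}; \mathbf{g}', \mathbf{h}'; \mathbf{g}'', \mathbf{h}'') < 0$, then $p^{\mathrm{tot}\star}_{e,\mathbf{g}'',\mathbf{h}''} \leq p^{\mathrm{tot}\star}_{e,\mathbf{g}',\mathbf{h}'}$.

Footnote: A key assumption of PIA is that, at every algorithm step, a unichain policy is produced. In order to satisfy this condition, we apply the technique of Appendix C.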


Proof: See Appendix 


In practice, it is better to use more energy in the directions 
where the function c(-,-,-) increases. A consequence of the 
previous proposition is derived in the following theorem. 


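Theorem 3. Consider $N = 1$. The transmission power of OSP is non-decreasing with $g$ and non-increasing with $h$ (we omit the “1” subscripts). Formally, if $g'' \geq g'$ and $h'' \leq h'$, then $p^{\mathrm{tot}\star}_{e,g'',h''} \geq p^{\mathrm{tot}\star}_{e,g',h'}$.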


Proof: See Appendix 


This is an expected result, i.e., when the legitimate channel 
improves, then it is reasonable to use more energy in order to 
get a higher rate. Conversely, when the eavesdropper’s channel 
improves, it is better not to use a lot of energy because only 
low rates can be obtained. In this case, it is better to conserve 
energy and wait for a better slot. The previous theorem is 
useful to prune the action space in the numerical computation: 
if we found the optimal transmission power for a given channel 
state, we could exploit it as lower [upper] bound for better 
[worse] channel states. 
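The pruning idea can be sketched as follows. The function `value(p, g)` is a hypothetical stand-in for the quantity maximized over the transmission power in the policy improvement step; the sketch only assumes that its maximizer is non-decreasing in the legitimate channel gain $g$, as Theorem 3 states for OSP. Scanning the gains in increasing order, the optimum found for one gain is used as a lower bound for the next one.

```python
import math

def best_power(e, g, value, p_min=0):
    """Smallest maximizer of value(p, g) over p in {p_min, ..., e}."""
    return max(range(p_min, e + 1), key=lambda p: value(p, g))

def optimal_powers_pruned(e, gains, value):
    """Scan the gains in increasing order; each optimum lower-bounds the next."""
    powers, evaluations, p_prev = {}, 0, 0
    for g in sorted(gains):
        evaluations += e - p_prev + 1      # candidates actually examined
        p_prev = best_power(e, g, value, p_min=p_prev)
        powers[g] = p_prev
    return powers, evaluations

def toy_value(p, g):
    """Toy objective (an assumption of this sketch): rate minus a linear energy cost."""
    return math.log2(1 + g * p) - 0.3 * p
```

With `toy_value`, the pruned search returns the same powers as an exhaustive search over $\{0, \ldots, e\}$ for every gain, while examining fewer candidates whenever some optimum is greater than zero.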

We expect that a result similar to Theorem 3 holds for a generic $N > 1$. A formal proof would require explicitly computing $D(p^{\mathrm{tot}}; \mathbf{g}', \mathbf{h}'; \mathbf{g}'', \mathbf{h}'')$ and showing that it is non-negative or non-positive (see the Appendix). However, this would require the computation of an analytical expression for $\eta$ in Equation (25). Even though this is in principle possible for any fixed $N$, the corresponding expression is very complicated and, in practice, the resulting $D(p^{\mathrm{tot}}; \mathbf{g}', \mathbf{h}'; \mathbf{g}'', \mathbf{h}'')$ is too long to be analytically tractable.


V. Optimal Secrecy Policy with Partial CSI 

In the previous sections we assumed that the realizations of 
G and H, namely g and fi, are known at the transmitter. This 
may not be true in practice. In particular, it is likely that, since 
the eavesdropper does not cooperate with the transmitter, its 
channel gain is unknown. In this section we gradually remove 
these assumptions and discuss how the achievable secrecy rate 
changes as a result. 

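We assume that $\mathbf{G} = [G_1, \ldots, G_N]$ and $\mathbf{H} = [H_1, \ldots, H_N]$ have independent components and are independent of each other. In this section we assume that all links are affected by i.i.d. Nakagami fading. This means that the amplitude of a received signal has a Nakagami pdf with parameters $m$ and $\Omega$, i.e.,

$$f(x; m, \Omega) = \frac{2}{\Gamma(m)}\left(\frac{m}{\Omega}\right)^{m} x^{2m-1} e^{-\frac{m}{\Omega}x^2}, \quad x \geq 0, \tag{28}$$

$$\Gamma(m) = \int_0^\infty t^{m-1} e^{-t}\, \mathrm{d}t. \tag{29}$$

Therefore, $G_r$ and $H_r$ exhibit a Gamma distribution. The pdf of $G_r$ (with mean $\bar{g}_r$) is

$$f_{G_r}(g; m) = \frac{1}{\Gamma(m)}\left(\frac{m}{\bar{g}_r}\right)^{m} g^{m-1} e^{-\frac{m}{\bar{g}_r} g}, \quad g \in \mathbb{R}_+,\ m \geq 1, \tag{30}$$

and similarly for $H_r$ (for presentation simplicity, we assume that the legitimate receiver and the eavesdropper have the same index $m$, but the analysis can be extended to a more general case). Note that $m = 1$ corresponds to Rayleigh fading and $f_{G_r}(g; 1) = \frac{1}{\bar{g}_r} e^{-g/\bar{g}_r}$ is an exponential distribution. As $m$ increases, the strength of the line of sight component increases. For ease of notation, in the remainder of the paper we drop the dependence on $m$ and implicitly assume $f_{G_r}(g) = f_{G_r}(g; m)$.

With constant rate coding and unknown eavesdropper's channel (Section V-A), the secrecy rate expression becomes

$$C_\mu = \sum_{e=0}^{e_{\max}} \pi_{\mu^{\mathrm{tot}}}(e) \iint \sum_{r=1}^{N} \log_2\left(\frac{1 + g_r p_{r;e,\mathbf{g}}}{1 + h_r p_{r;e,\mathbf{g}}}\right) \times \prod_{r=1}^{N} \big(f_{G_r}(g_r) f_{H_r}(h_r)\big)\, \mathrm{d}\mathbf{g}\, \mathrm{d}\mathbf{h}. \tag{33}$$

Footnote (to Proposition 3): Note that $\mathbf{p}^\star_{p^{\mathrm{tot}},\mathbf{g}',\mathbf{h}'}$ and $\mathbf{p}^\star_{p^{\mathrm{tot}},\mathbf{g}'',\mathbf{h}''}$ depend upon $p^{\mathrm{tot}}$.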
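As a quick numerical check of the fading model, the following Python sketch draws channel power gains from the Gamma distribution of (30), i.e., shape $m$ and scale $\bar{g}_r/m$, using the standard library generator `random.gammavariate`. The helper names and the sample size are assumptions of this sketch. The empirical variance shrinks as $m$ grows (it equals $\bar{g}_r^2/m$ for this distribution), which reflects the stronger line of sight component.

```python
import random

def sample_gain(mean_gain, m, rng):
    """Draw one channel power gain: Gamma with shape m and scale mean_gain / m."""
    return rng.gammavariate(m, mean_gain / m)

def sample_stats(mean_gain, m, n_samples=20000, seed=0):
    """Empirical mean and variance of the sampled gains."""
    rng = random.Random(seed)
    xs = [sample_gain(mean_gain, m, rng) for _ in range(n_samples)]
    mean = sum(xs) / n_samples
    var = sum((x - mean) ** 2 for x in xs) / n_samples
    return mean, var
```

Calling `sample_stats(1.0, 1)` corresponds to Rayleigh fading (exponentially distributed gain), whereas `sample_stats(1.0, 5)` gives gains concentrated around the mean.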


A. Unknown Eavesdropper’s Channel 

In this section, we assume that both the legitimate and 
the eavesdropper’s channels are affected by fading but CSI 
is available only for G. In this case, due to this lack of 
information, it may happen that EHD transmits even when 
the eavesdropper’s channel gain is higher than the legitimate 
one. 

Similarly to Expression (20) in the previous section, a policy $\mu$ can be defined as

$$\mu = \{\mathbf{p}_{e,\mathbf{g}} = [p_{1;e,\mathbf{g}}, \ldots, p_{N;e,\mathbf{g}}] \in \mathcal{P}_{\leq}(e),\ \forall e \in \mathcal{E},\ \forall \mathbf{g} \in \mathcal{G}\}, \tag{31}$$

and similarly for $\mu^{\mathrm{tot}}$. $\mathbf{p}_{e,\mathbf{g}}$ represents the transmission power used in state $(e, \mathbf{g})$ (since $\mathbf{h}$ is unknown, it cannot be included in the state of the system). We remark that $\mu$ performs a power control mechanism, i.e., a policy specifies only the transmission power $\mathbf{p}_{e,\mathbf{g}}$. However, in addition to power control, in every slot the code rate can also be changed (see Section II-B). In particular, variable rate coding provides higher secrecy rates than constant rate coding, but is more difficult to implement. In the following we analyze both these approaches.

1) Constant Rate Coding: The simplest assumption is that the coding scheme has constant rate and its choice only depends on the overall channel statistics. Using constant rate coding, the eavesdropper is able to gather more information than the legitimate receiver when its channel is better. Because of this, for some $r$, we may have

$$\log_2\left(\frac{1 + g_r p_{r;e,\mathbf{g}}}{1 + h_r p_{r;e,\mathbf{g}}}\right) < 0. \tag{32}$$

Note that in (33) we integrate both positive and negative terms. The negative terms are due to the fact that the eavesdropper's channel may be better than the legitimate one ($h_r > g_r$).

We now want to extract some properties of the optimal 
secrecy policy in this context. We start by performing the 
following computations, which will be used to extend the first 
point of Theorem 

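The channel memoryless property can be used to simplify (33) and recast the problem using an MDP. By integrating over $\mathbf{h}$, we obtain

$$C_\mu = \sum_{e=0}^{e_{\max}} \pi_{\mu^{\mathrm{tot}}}(e) \int \sum_{r=1}^{N} T_r^{\mathrm{con}}(g_r, p_{r;e,\mathbf{g}}) \prod_{r=1}^{N} f_{G_r}(g_r)\, \mathrm{d}\mathbf{g}, \tag{34}$$

$$T_r^{\mathrm{con}}(g, p) \triangleq \int_{\mathbb{R}_+} \log_2\left(\frac{1 + g p}{1 + h p}\right) f_{H_r}(h)\, \mathrm{d}h. \tag{35}$$

The function $T_r^{\mathrm{con}}(g, p)$ is presented in Equation (36), where $\mathrm{Ei}(z) = -\int_{-z}^{\infty} \frac{e^{-t}}{t}\, \mathrm{d}t$ is the exponential integral function and $s_i$, $t_i$ are constants (closed form expressions for them can be derived but are quite complicated).
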


1 

p) = log2(l + 9P) + {pAy~ 




(36) 


A secure transmission can be performed only if $C_\mu > 0$. The maximum of (34) can be found with an MDP approach, where the MC state is given by the pair $(e, \mathbf{g})$.

A property that directly follows from the definition of $T_r^{\mathrm{con}}(g, p)$ is the following.

Proposition 4. If for $p > 0$ we obtain $T_r^{\mathrm{con}}(g, p) < 0$, then allocating a power $p$ over sub-carrier $r$ is strictly sub-optimal.

This result is intuitive. Indeed, if $T_r^{\mathrm{con}}(g, p) < 0$ and $p > 0$, then in (34) we are adding negative terms. This is clearly sub-optimal because it lowers the secrecy rate and wastes energy at the same time.

Even if $T_r^{\mathrm{con}}(g, p)$ has a complicated expression, as we will see, we are interested in its double derivative with respect to $g$ and $p$:

$$\frac{\partial^2}{\partial p\, \partial g} T_r^{\mathrm{con}}(g, p) = \frac{1}{\log 2}\, \frac{1}{(1 + g p)^2}, \tag{37}$$

where $\log$ denotes the natural logarithm.

We now show that even with partial CSI the optimal secrecy policy increases with the legitimate channel gain. As for Theorem 3, the following result can be used to prune the action space.

Theorem 4. Consider $N = 1$. With partial CSI, the transmission power of OSP is non-decreasing with $g$ (we omit the “1” subscripts). Formally, if $g'' \geq g'$, then $p^{\mathrm{tot}\star}_{e,g''} \geq p^{\mathrm{tot}\star}_{e,g'}$.

Proof: The proof follows the same steps presented in the Appendices for the full CSI case. To prove the theorem the key point is that

$$\frac{\partial^2}{\partial p\, \partial g} T^{\mathrm{con}}(g, p) > 0. \tag{38}$$

Note that, considering the derivative with respect to $p$, it follows from (38) that $\frac{\partial}{\partial g} T^{\mathrm{con}}(g, p_B) - \frac{\partial}{\partial g} T^{\mathrm{con}}(g, p_A) > 0$ for $p_A < p_B$. We can rewrite the inequality as $\frac{\partial}{\partial g}\big(T^{\mathrm{con}}(g, p_B) - T^{\mathrm{con}}(g, p_A)\big) > 0$ and obtain

$$T^{\mathrm{con}}(g + \Delta, p_A) - T^{\mathrm{con}}(g, p_A) < T^{\mathrm{con}}(g + \Delta, p_B) - T^{\mathrm{con}}(g, p_B), \tag{39}$$

$\forall \Delta > 0$ and $p_A < p_B$. This condition can be used in place of the corresponding one in the Appendix to prove the theorem. ■

Footnote: Differently from the complete CSI case of Section IV, $p_r$ cannot be set to 0 if $g_r < h_r$ (see Equation (25)), thus using constant rate or variable rate coding leads to different results.

Footnote: Closed form expressions for $s_i$ and $t_i$ can be derived but are quite complicated. Moreover, we will see that they do not contribute to our next results.

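2) Variable Rate Coding: Better performance can be obtained with variable rate coding. In this case, in every slot, the code rate is matched to the legitimate channel rate. Thus, even if $g_r < h_r$ (eavesdropper's channel is better), the eavesdropper can gather at most $\log_2(1 + g_r p_r)$ bits (legitimate transmission rate) and not $\log_2(1 + h_r p_r)$ (eavesdropper's transmission rate). The secrecy rate expression is

$$C_\mu = \sum_{e=0}^{e_{\max}} \pi_{\mu^{\mathrm{tot}}}(e) \iint \sum_{r=1}^{N} \left[\log_2\left(\frac{1 + g_r p_{r;e,\mathbf{g}}}{1 + h_r p_{r;e,\mathbf{g}}}\right)\right]^+ \times \prod_{r=1}^{N} \big(f_{G_r}(g_r) f_{H_r}(h_r)\big)\, \mathrm{d}\mathbf{g}\, \mathrm{d}\mathbf{h}. \tag{40}$$

As before, we introduce a function $T_r^{\mathrm{var}}(g_r, p_{r;e,\mathbf{g}})$ such that

$$C_\mu = \sum_{e=0}^{e_{\max}} \pi_{\mu^{\mathrm{tot}}}(e) \int \sum_{r=1}^{N} T_r^{\mathrm{var}}(g_r, p_{r;e,\mathbf{g}}) \prod_{r=1}^{N} f_{G_r}(g_r)\, \mathrm{d}\mathbf{g}, \tag{41}$$

$$T_r^{\mathrm{var}}(g, p) \triangleq \int_{\mathbb{R}_+} \left[\log_2\left(\frac{1 + g p}{1 + h p}\right)\right]^+ f_{H_r}(h)\, \mathrm{d}h \tag{42}$$

$$= \int_0^{g} \log_2\left(\frac{1 + g p}{1 + h p}\right) f_{H_r}(h)\, \mathrm{d}h. \tag{43}$$

In Equation (43) we integrate from zero to $g$, thus we remove the $[\cdot]^+$ notation (see the structure of the secrecy rate expression with variable rate coding).

Note that $T_r^{\mathrm{var}}(g, p) \geq T_r^{\mathrm{con}}(g, p)$, which justifies the fact that the achievable secrecy rate with variable rate coding is higher than with constant rate coding.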
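The comparison between the two coding schemes can be checked numerically. The following Python sketch evaluates the integrals that define the constant rate and variable rate functions with a simple midpoint rule; the function names, the truncation of the infinite integral at 60 times the mean gain, and the number of integration steps are assumptions of this sketch and not part of the original analysis.

```python
import math

def gain_pdf(x, mean, m=1):
    """Gamma pdf of a channel gain with the given mean and Nakagami index m."""
    return (m / mean) ** m * x ** (m - 1) / math.gamma(m) * math.exp(-m * x / mean)

def _integrate(f, a, b, steps=20000):
    """Midpoint rule on [a, b]."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

def t_con(g, p, h_mean, m=1):
    """Constant rate coding: the integrand is negative where h > g."""
    f = lambda h: math.log2((1 + g * p) / (1 + h * p)) * gain_pdf(h, h_mean, m)
    return _integrate(f, 0.0, 60.0 * h_mean)

def t_var(g, p, h_mean, m=1):
    """Variable rate coding: only h in [0, g] contributes to the integral."""
    f = lambda h: math.log2((1 + g * p) / (1 + h * p)) * gain_pdf(h, h_mean, m)
    return _integrate(f, 0.0, g)
```

For a weak legitimate channel, e.g. `t_con(0.2, 10.0, 1.0)`, the constant rate function is negative, so transmitting is counterproductive, while `t_var(0.2, 10.0, 1.0)` is still positive.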

The analogue of Theorem 4 holds in this case, as can be proved by exploiting the structure of the double derivative of $T_r^{\mathrm{var}}(g, p)$:

$$\frac{\partial^2}{\partial p\, \partial g} T_r^{\mathrm{var}}(g, p) = \frac{1}{\log 2}\, \frac{\gamma(m, m g / \bar{h}_r)}{(1 + g p)^2\, \Gamma(m)}, \tag{44}$$

where $\gamma(m, z) = \int_0^{z} e^{-t} t^{m-1}\, \mathrm{d}t$ is the (lower) incomplete gamma function and $\bar{h}_r$ is the mean of $H_r$.

Footnote: We provide a formal proof only for the case $N = 1$ because, even if theoretically possible, the proof for a generic $N > 1$ is not analytically tractable (see the related discussion just after Theorem 3).

Figure 2: Transmission power $p^{\mathrm{tot}\star}_{e,g,h}$ as a function of the battery level $e$ for several values of $h$ and $g \in [0.41, 0.51)$.


B. No Channel State Information 

Lower secrecy rates are obtained when the legitimate receiver's channel is also unknown. In particular, the transmission power cannot be adapted to the current channel state. It is easy to show that $C_\mu$ can be greater than zero only if $\bar{g}_r > \bar{h}_r$ for some $r$. However, the mean values of the channel gains are not controlled by the transmitter (they are physical quantities), thus if the legitimate channel is (statistically) worse, no secrecy can be achieved.


VI. Numerical Evaluation 

In this section we discuss how the secrecy rate changes as 
a function of the different system parameters. 

We compare the following scenarios: OSP with full CSI (OSP-FULL), OSP with only legitimate channel knowledge and constant rate coding (OSP-PAR-CON) or variable rate coding (OSP-PAR-VAR) and OSP with only statistical channel knowledge (OSP-STAT).

If not otherwise stated, the simulation parameters are: $e_{\max} = 30$, truncated geometric energy arrivals with $b_{\max} = 6$ and $\bar{b} = 1$, $n = 15$ quantization intervals (see Section III-B), $N = 1$ (single sub-carrier), $\bar{g} = \bar{h} = 1$ (symmetric scenario), $\mathcal{G} = \mathcal{H} = \mathbb{R}_+$ with $m = 1$ (Rayleigh fading). After showing results for this choice of parameters, we study the sensitivity of the system performance by changing one or more parameters while keeping the others fixed.

Figure 3: Steady-state probabilities $\pi_{\mu^{\mathrm{tot}}}(e)$ as a function of the battery level $e$.

Figure 4: Secrecy rate $C_\mu$ as a function of the battery size $e_{\max}$ in the case of symmetric channel conditions.


1) Fixed Parameters: Figure 2 shows the optimal transmission power $p^{\mathrm{tot}\star}_{e,g,h}$ as a function of the battery level $e$ when $g \in [0.41, 0.51)$ and $h \in \mathbb{R}_+$. We recall that, when $\mathcal{G} = \mathcal{H} = \mathbb{R}_+$, we use the technique explained in Section III-B, i.e., we have a finite number of points where the transmission power is computed ($n = 15$). When $h \geq 0.51$, the transmission power is identically zero because the eavesdropper is always advantaged. Also when $h \in [0.41, 0.51)$ the transmission power is zero. This is not obvious a priori and strongly depends upon the considered interval of $g$. It can be seen that Theorem 3 holds, i.e., $p^{\mathrm{tot}\star}_{e,g,h}$ does not increase with $h$.

Finally, we note that the behavior of the transmission power is not obvious a priori, e.g., it is significantly different from a simple greedy policy ($p^{\mathrm{tot}} = e$) even when $h$ is low.

Figure 3, instead, shows the steady-state probabilities as a function of the energy level $e$, for fixed $e_{\max}$ and in the different scenarios. In all cases, the curves are similar. This is because the device tends to operate in an efficient region, i.e., approximately at $e_{\max}/2$, in order to avoid energy outage and overflow, which degrade the performance of the system. When $e$ approaches $e_{\max}$, the steady-state tails increase because of the overflow (when the battery is almost full, all harvesting events leading to overflow contribute to increasing the steady-state probability of state $e_{\max}$, which is then higher than those of the immediately lower states).

2) Battery Size: In Figure 4 we show the rate achieved by the various policies as a function of the battery size $e_{\max}$. We use Rayleigh fading ($m = 1$) and a general Nakagami fading with a strong Line of Sight (LoS) component ($m = 5$). The curves of OSP-STAT are identically zero because $\bar{g} = \bar{h}$. As expected, OSP-FULL has the highest secrecy rate for every value. It can be seen that the curves saturate after a certain value. This is due to the combination of two effects: 1) the harvesting rate of the EHD is limited (it can be shown that the performance of an EH system is bounded) and 2) the achievable secrecy rate always saturates in the high power regime (because of the structure of the secrecy rate expression). Note that the curves saturate already for small $e_{\max}$; therefore, in practice, it may be sufficient to use small batteries to obtain high secrecy rates.

Figure 5: Secrecy rate $C_\mu$ as a function of the battery size $e_{\max}$ in the case of asymmetric channel conditions and Rayleigh fading.

In [29, Section IV-B] the authors showed that, when the transmission is subject to an average power constraint, the performance of the optimal transmission scheme with variable rate coding and partial CSI knowledge approaches the performance of the full CSI case when the transmission power is sufficiently high. In our previous example, OSP-PAR-VAR does not achieve OSP-FULL when $e_{\max}$ increases because an energy harvesting system imposes an average power constraint. It can be verified that, when $\bar{b}$ increases, if the battery size is sufficiently large, the gap between OSP-PAR-VAR and OSP-FULL is smaller.

Footnote: This can be easily derived starting from the causality constraint

$$\sum_{k=0}^{K} \sum_{r=1}^{N} p_r^{(k)} \leq \sum_{k=0}^{K-1} B^{(k)} + E^{(0)}, \quad \forall K = 0, 1, \ldots \tag{45}$$

where $p_r^{(k)}$ is the transmission power over sub-carrier $r$ in time slot $k$, $B^{(k)}$ is the amount of energy harvested in slot $k$ and $E^{(0)}$ is the amount of energy initially available in the battery. In the long run, the right-hand side becomes the power constraint of our system.

Figure 6: Secrecy rate $C_\mu$ as a function of the number of sub-carriers $N$.
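The energy causality constraint behind the average power constraint (the energy spent up to slot $K$ cannot exceed the initial energy plus the energy harvested up to slot $K-1$) can be checked on a given transmission schedule with a few lines of Python. The function name and the list-based data layout are assumptions of this sketch; battery overflow is not modeled here.

```python
def satisfies_causality(powers, harvested, initial_energy):
    """powers[k]: per-sub-carrier powers used in slot k;
    harvested[k]: energy harvested in slot k (usable from slot k + 1)."""
    used = 0.0
    available = float(initial_energy)
    for k, slot_powers in enumerate(powers):
        used += sum(slot_powers)
        if used > available + 1e-12:
            return False                 # more energy spent than collected
        if k < len(harvested):
            available += harvested[k]
    return True
```

For instance, `satisfies_causality([[2]], [5], 1)` is false: the energy harvested in slot 0 cannot be used in slot 0 itself.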

Note that the achievable secrecy rates strongly depend upon the fading statistics. With $m = 5$, we have strong LoS components, i.e., the channel pdfs tend to be narrow around their means ($\bar{g} = \bar{h}$). It follows that the legitimate and eavesdropper's channel gains are close to each other most of the time. This corresponds to low values of $R_{g_r,h_r}(p_r)$, thus a low secrecy rate. With Rayleigh fading, instead, exploiting channel diversity allows higher rewards to be obtained. This is also the reason why, with Rayleigh fading, full channel state information (OSP-FULL) provides a great improvement with respect to the partial knowledge cases.

Figure 5 is similar to the previous one but with asymmetric channel gains. When the eavesdropper is advantaged ($\bar{g} = 1$, $\bar{h} = 2$), even if only low performance can be achieved, secret transmission is still possible. When OSP-PAR-CON is used, it is likely that EHD transmits even when the eavesdropper's channel is better, and in this case the secrecy rate is lower. This effect is emphasized if the eavesdropper's channel is advantaged, because it is more likely that the legitimate channel is the worse of the two.

On the other hand, if the legitimate channel is better ($\bar{g} = 2$, $\bar{h} = 1$), the secrecy rate can reach high values. In this case, OSP-STAT is also considered and, as expected, is the worst among the optimal policies.

3) Number of sub-carriers: When $N = 1$, finding the optimal policies for high values of $n$ (fine quantization of the channel gains) is feasible. We recall that the number of states of the MC is directly proportional to the number of possible combinations of channel gains. Thus, with $N = 1$, the possible combinations are $n \times n$ (legitimate channel $\times$ eavesdropper's channel). With a generic $N$, the combinations become $n^N \times n^N$. Thus, the number of states grows exponentially with the number of sub-carriers, making the optimization process for high $N$ infeasible in practice (curse-of-dimensionality). Even when the problem symmetry can be exploited (when $G_r$ and $H_r$ are i.i.d.), the computational effort still remains heavy. In practice, this approach can be applied to multi-carrier scenarios if the number of carriers, $N$, and the number of quantization levels for the channel, $n$, are not too large. Note however that our solution suffers from a dimensionality problem because it is the optimal solution. Part of our future work agenda includes the design of sub-optimal schemes and the study of trade-offs between computational times and performance.

Figure 7: Secrecy rate of OSP-FULL as a function of the eavesdropper's BAD channel probability in a binary channel system.
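The growth of the state space can be made concrete with a short Python sketch. It assumes that the battery takes the $e_{\max} + 1$ integer levels $0, \ldots, e_{\max}$ and that each of the $2N$ channel gains (legitimate and eavesdropper's, for every sub-carrier) is quantized into $n$ levels; the function name is hypothetical.

```python
def num_states(e_max, n, n_subcarriers):
    """Battery levels times the n^N x n^N channel gain combinations."""
    return (e_max + 1) * n ** (2 * n_subcarriers)

# With the default parameters (e_max = 30, n = 15), going from one to two
# sub-carriers multiplies the number of states by n^2 = 225.
growth = num_states(30, 15, 2) // num_states(30, 15, 1)
```

Each additional sub-carrier multiplies the number of states by $n^2$, which is why fine quantization is affordable only for small $N$.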

In the following, as an example, we consider a discrete GOOD-BAD channel and discuss the importance of the power splitting scheme. We define $\mathcal{G} = \mathcal{H} = \{B, G\} = \{1/30, 3/30\} \approx (-15\ \mathrm{dB}, -10\ \mathrm{dB})$ with probabilities 0.7 and 0.3, respectively. We also set $e_{\max} = 10$ because, generally, the saturation region is almost reached for this battery size (see Figures 4 and 5). In Figure 6 we plot OSP-FULL as a function of the number of sub-carriers $N$ when the optimal (Equations (25)-(26)) or a uniform power splitting is used. In the optimal case, as $N$ increases, the reward also increases. This is expected because, when one sub-carrier experiences a bad channel condition, the power can be directed to other good sub-carriers. Instead, with uniform power splitting, the secrecy rate decreases with $N$. In practice, this happens because, instead of sending all the transmission power in the “good” sub-carriers, a fraction of it is wasted in the “bad” sub-carriers. For example, with $N = 2$, it may happen that over sub-carrier 1 the pair legitimate-eavesdropper's channel gain is $(G, B)$ whereas, for sub-carrier 2, the legitimate channel is not better than the eavesdropper's one, i.e., sub-carrier 1 is a “good” sub-carrier while sub-carrier 2 is not. In this case, if a positive transmission power were used, the corresponding reward would be greater than zero but the power sent over sub-carrier 2 would be wasted (no power is wasted during the transmission only when the two pairs are $(G, B)$ and $(G, B)$). This explains why the performance degrades as the number of sub-carriers increases. Moreover, the effect is emphasized with larger $N$ because there are more cases where the transmission power cannot be fully exploited.

When the legitimate and the eavesdropper's channel gains are known in every slot, using a smart power splitting scheme is convenient because it can significantly improve the network performance. If this is not possible (e.g., because this information is not available or not reliable), a sub-optimal strategy needs to be adopted, e.g., uniform power splitting, which is simpler to implement but yields lower performance in general. The study of the information/performance tradeoff for power splitting strategies is left for future work.

Finally, Figure 7 shows how the optimal secrecy rate changes as a function of $P(h_1 = B) = P(h_2 = B) = \ldots \in [0, 1]$ for different numbers of sub-carriers. It can be noticed that the case with five sub-carriers and $P(h_1 = B) = 0.2$ achieves the same performance as the system with only one sub-carrier but $P(h_1 = B) = 1$. In practice, the diversity offered by a greater number of sub-carriers can be efficiently exploited to obtain higher secrecy rates. An interesting point is that, as $N$ increases, the improvement obtained from $N$ to $N + 1$ decreases. This is due to the concavity properties of the secrecy rate expression. Therefore, it may not be necessary to use a large number of sub-carriers to obtain high secrecy rates.

VII. Conclusions 

In this work we analyzed an Energy Harvesting Device that has a finite energy storage and transmits secret data to a receiver over $N$ parallel channels exploiting physical layer characteristics. We found the best power allocation technique, namely the Optimal Secrecy Policy (OSP), in several contexts depending on the degree of channel knowledge the device has. We proved several properties of OSP and in particular that it is deterministic and monotonic. We also described a technique to compute OSP by decomposing the problem into two steps and using a dynamic programming approach. When only partial channel state information is available, we described how the maximum secrecy rate varies with constant and variable rate coding, explaining and numerically evaluating the advantages of variable rate coding. We numerically showed that, because of the limited harvesting rate that is inherently provided by the renewable energy source, OSP-PAR-VAR does not achieve the same performance as OSP-FULL as the battery size increases, and noted that it is not necessary to use very large batteries to achieve close to optimal performance. We also set up the problem when more than one sub-carrier is considered, and discussed the scalability problems related to such a scenario. Also, we found that using the optimal power splitting scheme provides a significant advantage with respect to the simpler uniform splitting approach.

Future work may include the study of sub-optimal strategies for the case with $N$ sub-carriers in order to avoid the curse-of-dimensionality problem. Also, other optimization techniques can be investigated, e.g., an offline approach, Lyapunov optimization or a reinforcement learning approach. Finally, it would be interesting to set up a simulation experiment with real data measurements (e.g., for the harvesting process) in order to validate our results in a realistic scenario.

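Appendix A
Proof of Theorem 1

We want to show that OSP is a deterministic policy, i.e., given the state of the system, $\mu(\mathbf{p}; e, \mathbf{g}, \mathbf{h}) = \delta_{\mathbf{p},\mathbf{p}^\star}$, where $\delta_{\cdot,\cdot}$ is the Kronecker delta function.

Note that the study can be split into two parts, according to the decomposition of $\mu$ into the two terms below. Thus, we only need to prove that both $\gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h})$ (transmit power policy) and $\varphi_\mu(\mathbf{p}; p^{\mathrm{tot}}, e, \mathbf{g}, \mathbf{h})$ (power splitting policy) are deterministic. In the following we prove the first part. The latter is derived in [29].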


A. Deterministic Transmit Power Policy 

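As a preliminary result, we need the following proposition (in this subsection, the expectation is always taken with respect to $\mathbf{G}$ and $\mathbf{H}$).

Proposition 5. $P(E^{(k)} = e \mid E^{(0)})$ depends upon the policy only through $\mathbb{E}[\gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{G}, \mathbf{H})]$, $\forall p^{\mathrm{tot}} \in \{0, \ldots, e\}$, $\forall e \in \mathcal{E}$.

Proof: The proof is by induction on $k$. At $k = 0$, $P(E^{(0)} = e \mid E^{(0)} = e_0)$ is equal to 1 if $e = e_0$ and to 0 otherwise. In this case there is no dependence upon the policy.

Assume that the thesis is true for $k$ (inductive hypothesis). Using the chain rule, the probability that $E^{(k+1)} = e'$ given the initial state is

$$P(E^{(k+1)} = e' \mid E^{(0)}) = \sum_{e=0}^{e_{\max}} P(E^{(k+1)} = e' \mid E^{(k)} = e) \times P(E^{(k)} = e \mid E^{(0)}). \tag{46}$$

Thus, to prove the thesis, we focus on $P(E^{(k+1)} = e' \mid E^{(k)} = e)$, whereas for $P(E^{(k)} = e \mid E^{(0)})$ we use the inductive hypothesis. Assume $e' < e_{\max}$:

$$P(E^{(k+1)} = e' \mid E^{(k)} = e) = \sum_{b = \max\{0, e' - e\}}^{\min\{e', b_{\max}\}} p_B(b)\, \mathbb{E}[\gamma_\mu(e - e' + b; e, \mathbf{G}, \mathbf{H})], \tag{47}$$

whereas, if $e' = e_{\max}$:

$$P(E^{(k+1)} = e_{\max} \mid E^{(k)} = e) = \sum_{b = \max\{0, e_{\max} - e\}}^{b_{\max}} p_B(b) \sum_{d=0}^{e - e_{\max} + b} \mathbb{E}[\gamma_\mu(d; e, \mathbf{G}, \mathbf{H})]. \tag{48}$$

Note that we used the transmit power policy $\gamma_\mu(\cdot)$ and not the power allocation policy $\mu(\cdot)$. Indeed, the battery evolution does not depend upon the particular power splitting scheme but only on the total energy consumed. Thus, $P(E^{(k+1)} = e' \mid E^{(0)})$ depends upon the policy only through the expectations $\mathbb{E}[\gamma_\mu(p^{\mathrm{tot}}; E^{(k)}, \mathbf{G}, \mathbf{H})]$. ■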

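Define now the long-term probabilities of being in the energy level $e$ given the initial level $E^{(0)}$ as $\pi(e \mid E^{(0)}) = \liminf_{K \to \infty} \frac{1}{K} \sum_{k=0}^{K-1} P(E^{(k)} = e \mid E^{(0)})$. Thanks to the above proposition, we know that $\pi(e \mid E^{(0)})$ depends upon the policy only through $\mathbb{E}[\gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{G}, \mathbf{H})]$, $\forall p^{\mathrm{tot}} \in \{0, \ldots, e\}$, $\forall e \in \mathcal{E}$.

Fix a value $\alpha(p^{\mathrm{tot}}; e)$ for every pair $p^{\mathrm{tot}}$ and $e$, and consider the set of policies $S$ that induce $\mathbb{E}[\gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{G}, \mathbf{H})] = \alpha(p^{\mathrm{tot}}; e)$ for every pair. For every policy in $S$, the long-term probabilities are the same. The long-term average secrecy rate given an initial state can be expressed as in Equation (16):

$$C_\mu(E^{(0)}) = \sum_{e=0}^{e_{\max}} \pi(e \mid E^{(0)}) \times \mathbb{E}\Big[\sum_{\mathbf{p} \in \mathcal{P}_{\leq}(e)} \mu(\mathbf{p}; e, \mathbf{G}, \mathbf{H})\, c(\mathbf{p}, \mathbf{G}, \mathbf{H})\Big]. \tag{49}$$

For every policy in $S$, the terms $\pi(e \mid E^{(0)})$ of the previous expression are the same. Therefore, in order to maximize $C_\mu(E^{(0)})$, we focus on the terms $\mathbb{E}[\cdot]$ for each value of $e$. In particular, the problem can be decomposed into $e_{\max} + 1$ simpler optimization problems (define $\mu(e) = \{\mu(\cdot; e, \mathbf{g}, \mathbf{h}),\ \forall \mathbf{g} \in \mathcal{G},\ \mathbf{h} \in \mathcal{H}\}$):

$$\max_{\mu(e)}\ \mathbb{E}\Big[\sum_{\mathbf{p} \in \mathcal{P}_{\leq}(e)} \mu(\mathbf{p}; e, \mathbf{G}, \mathbf{H})\, c(\mathbf{p}, \mathbf{G}, \mathbf{H})\Big], \tag{50a}$$

$$\text{s.t.: Constraints in (15);} \tag{50b}$$

$$\mathbb{E}[\gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{G}, \mathbf{H})] = \alpha(p^{\mathrm{tot}}; e),\quad \forall p^{\mathrm{tot}} \in \{0, \ldots, e\}. \tag{50c}$$

We rewrite the first expression as follows:

$$\max_{\mu(e)}\ \mathbb{E}\Big[\sum_{p^{\mathrm{tot}} \in \{0, \ldots, e\}} \gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{G}, \mathbf{H}) \sum_{\mathbf{p} \in \mathcal{P}_{=}(p^{\mathrm{tot}})} \varphi_\mu(\mathbf{p}; p^{\mathrm{tot}}, e, \mathbf{G}, \mathbf{H})\, c(\mathbf{p}, \mathbf{G}, \mathbf{H})\Big], \tag{51}$$

where $\mathcal{P}_{=}(p^{\mathrm{tot}}) = \{\mathbf{p} : \mathbf{p} \succeq 0,\ p^{\mathrm{tot}} = \sum_{r=1}^{N} p_r\}$.

As derived in [29, Eq. 7] with a Lagrangian approach, $\varphi_\mu(\mathbf{p}; p^{\mathrm{tot}}, e, \mathbf{g}, \mathbf{h})$ is deterministic and equal to $\delta_{\mathbf{p}, \boldsymbol{\tau}^\star_{p^{\mathrm{tot}},\mathbf{g},\mathbf{h}}}$ (there is no dependence upon $e$ when $p^{\mathrm{tot}}$ is fixed). $\boldsymbol{\tau}^\star_{p^{\mathrm{tot}},\mathbf{g},\mathbf{h}}$ is the optimal transmit power splitting given the total transmission power and the channel gains (we use $\boldsymbol{\tau}$ instead of $\mathbf{p}$ for notation clarity). Therefore, we can rewrite (51) as

$$\max_{\gamma_\mu(e)}\ \mathbb{E}\Big[\sum_{p^{\mathrm{tot}} \in \{0, \ldots, e\}} \gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{G}, \mathbf{H})\, c(\boldsymbol{\tau}^\star_{p^{\mathrm{tot}},\mathbf{G},\mathbf{H}}, \mathbf{G}, \mathbf{H})\Big]. \tag{52}$$

For every fixed $e$, we want to define $\gamma_\mu(e) = \{\gamma_\mu(\cdot; e, \mathbf{g}, \mathbf{h}),\ \forall \mathbf{g} \in \mathcal{G},\ \mathbf{h} \in \mathcal{H}\}$. Note that the problem is concave, thus a Lagrangian approach can be used. The Lagrangian function is

$$\mathcal{L}(e) = \mathbb{E}\Big[\sum_{p^{\mathrm{tot}} \in \{0, \ldots, e\}} \gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{G}, \mathbf{H}) \times \big(c(\boldsymbol{\tau}^\star_{p^{\mathrm{tot}},\mathbf{G},\mathbf{H}}, \mathbf{G}, \mathbf{H}) - \lambda(p^{\mathrm{tot}}; e)\big)\Big], \tag{53}$$

where $\lambda(p^{\mathrm{tot}}; e)$ is the Lagrange multiplier associated with constraint $\mathbb{E}[\gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{G}, \mathbf{H})] = \alpha(p^{\mathrm{tot}}; e)$.

We now show that an optimal policy is $\gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) = 1$ if $p^{\mathrm{tot}} = p^{\mathrm{tot}\star}_{e,\mathbf{g},\mathbf{h}}$ and zero otherwise, with

$$p^{\mathrm{tot}\star}_{e,\mathbf{g},\mathbf{h}} = \arg\max_{p^{\mathrm{tot}} \in \{0, \ldots, e\}} \underbrace{\big\{c(\boldsymbol{\tau}^\star_{p^{\mathrm{tot}},\mathbf{g},\mathbf{h}}, \mathbf{g}, \mathbf{h}) - \lambda(p^{\mathrm{tot}}; e)\big\}}_{\triangleq\, u(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h})}. \tag{54}$$

In order to maximize (53), we can focus on each argument of the expectation:

$$\max_{\gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}),\ \forall p^{\mathrm{tot}} \in \{0, \ldots, e\}}\ \sum_{p^{\mathrm{tot}} \in \{0, \ldots, e\}} \gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) \times \big(c(\boldsymbol{\tau}^\star_{p^{\mathrm{tot}},\mathbf{g},\mathbf{h}}, \mathbf{g}, \mathbf{h}) - \lambda(p^{\mathrm{tot}}; e)\big). \tag{55}$$

We recall that $\sum_{p^{\mathrm{tot}} \in \{0, \ldots, e\}} \gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) = 1$. (55) is a weighted sum that is maximized when $\gamma_\mu(p^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) = 1$ if $p^{\mathrm{tot}} = p^{\mathrm{tot}\star}_{e,\mathbf{g},\mathbf{h}}$ and zero otherwise. Indeed, suppose by contradiction that there exist $p_1^{\mathrm{tot}}$ and $p_2^{\mathrm{tot}}$ (the argument can be generalized to more than two) such that $\gamma_\mu(p_1^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) > 0$, $\gamma_\mu(p_2^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) > 0$ and $\gamma_\mu(p_1^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) + \gamma_\mu(p_2^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) = 1$. The max argument in (55) would be $\gamma_\mu(p_1^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h})\, u(p_1^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}) + (1 - \gamma_\mu(p_1^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h}))\, u(p_2^{\mathrm{tot}}; e, \mathbf{g}, \mathbf{h})$, which is smaller than or equal to $u(p^{\mathrm{tot}\star}_{e,\mathbf{g},\mathbf{h}}; e, \mathbf{g}, \mathbf{h})$.

Footnote: A proof of this result in the discounted horizon case can be found in [43, Theorems 6.2.9 and 6.2.10]. In our discussion we follow a different approach, which will also be useful to prove a later proposition.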

Appendix B 

Proof of Proposition[T] 

The MC has three dimensions: the battery, the legitimate 
channel and the eavesdropper’s channel. Since the fading is 
not controlled by the EHD, the MC is always free to move 
along the last two dimensions (we assume that the channel 
evolution is i.i.d. over time). Thus, the only potential problem 
is related to the battery dimension, i.e., if the policy is not 
unichain, the device energy level may be stuck in different 
subsets of (S. 

Also, we recall that we consider only discrete channel 
conditions with non-zero probability (Remark [B- We now 
discuss Point 1). We want to show that the recurrent class is 
composed of the states with high energy levels, i.e., for every $e < e_{\max}$ there exists a positive probability of increasing the energy level. This is true by hypothesis because the maximum transmit power in state $e$ is lower than the maximum number of energy arrivals $b_{\max}$ ($p^{\rm tot}_{e,g,h} < b_{\max}$). Therefore, since it is possible to reach the energy level $e_{\max}$ (fully charged battery) within a certain number of steps from every state, the policy is unichain. To prove Point 2), a symmetric reasoning can be followed.

If both conditions hold, it is possible to reach every $e \in \mathcal{E}$ from any element of $\mathcal{E}$, thus the policy induces an irreducible MC. Since the number of states is finite, the MC is positive recurrent.
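The reachability argument of Point 1) can be checked numerically on a toy battery model. The sketch below is only an illustration under our own assumptions: the channel state is collapsed into a set of possible transmit powers per energy level, every number of arrivals between 0 and `b_max` has positive probability, and the battery update is `min(e_max, e - p + b)`. None of these names come from the paper.

```python
from collections import deque


def next_levels(e, powers, b_max, e_max):
    """Battery levels reachable in one slot from level e (assumed update rule)."""
    if any(p > e for p in powers):
        raise ValueError("transmit power cannot exceed the stored energy")
    return {min(e_max, e - p + b) for p in powers for b in range(b_max + 1)}


def full_battery_reachable_from_all(policy_powers, b_max, e_max):
    """True if level e_max can be reached with positive probability from every level."""
    # Reverse graph: pred[n] holds the levels that can move to n in one slot.
    pred = {e: set() for e in range(e_max + 1)}
    for e in range(e_max + 1):
        for n in next_levels(e, policy_powers[e], b_max, e_max):
            pred[n].add(e)
    # Breadth-first search backwards from the fully charged state.
    seen = {e_max}
    queue = deque([e_max])
    while queue:
        n = queue.popleft()
        for e in pred[n] - seen:
            seen.add(e)
            queue.append(e)
    return len(seen) == e_max + 1


# Hypothetical policy that never spends more than one unit (less than b_max = 2).
conservative = {e: {0, min(e, 1)} for e in range(6)}
# Hypothetical policy that always empties the battery, with b_max = 1.
greedy = {e: {e} for e in range(6)}

conservative_ok = full_battery_reachable_from_all(conservative, b_max=2, e_max=5)
greedy_ok = full_battery_reachable_from_all(greedy, b_max=1, e_max=5)
```

The first policy satisfies the hypothesis of Point 1) and every level can reach the full battery. The second violates it and remains confined to the lowest levels.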


Appendix C 

Deriving a Unichain Policy 

As in Appendix B, it is always possible to move along
the channel dimensions. Therefore, we focus on the battery 
dimension, which represents the only limitation for obtaining 
a unichain policy. 

Consider a policy $\mu_A$ that has two recurrent classes (this approach can be generalized to more than two classes) and assume, without loss of generality, that the greatest long-term reward is reached if the initial state belongs to the first one, $\Pi'_A$. We now propose a technique to derive a new policy that, regardless of the initial state, achieves the same maximum reward of $\mu_A$.

Consider a second policy, namely $\mu_B$, obtained from $\mu_A$ as follows. For every $e_A = 0, \ldots, \max\{\Pi'_A\}$, set $\boldsymbol{\rho}_{e_B,g,h} = \boldsymbol{\rho}_{e_A,g,h}$, with $e_B = e_A + e_{\max} - \max\{\Pi'_A\}$, i.e., we shift the recurrent class $\Pi'_A$ toward higher energy levels (we name $\Pi'_B$ the new recurrent class). For $e_B \in \{0, \ldots, e_{\max} - \max\{\Pi'_A\} - 1\}$, set $\boldsymbol{\rho}_{e_B,g,h} = \mathbf{0}$. In this way, the device cannot be stuck in energy levels lower than $e_{\max} - |\Pi'_A| + 1$ (the harvested energy increases the battery level) and, after a certain number of transitions, it reaches the recurrent class $\Pi'_B$. Finally, since the power splitting vectors in the recurrent classes $\Pi'_A$ and $\Pi'_B$ coincide, $\mu_B$ achieves the same maximum reward of $\mu_A$, regardless of the initial state.

This proves that it is always possible to obtain a unichain 
policy with the same maximum long-term secrecy rate as the 
initial one and shows how to derive it. 
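The construction can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: the policy is reduced to one scalar transmit power per energy level (standing in for the power splitting vector and ignoring the channel state), and the example policy and its recurrent classes are hypothetical.

```python
def shift_policy(policy_a, recurrent_class, e_max):
    """Build policy_b by moving the chosen recurrent class to the top of the battery.

    policy_a maps each energy level 0..e_max to an action (here a scalar power).
    """
    shift = e_max - max(recurrent_class)
    policy_b = {}
    for e_b in range(e_max + 1):
        if e_b < shift:
            # Idle below the shifted class: harvested energy can only raise the level.
            policy_b[e_b] = 0
        else:
            # Copy the action that policy_a uses at e_a = e_b - shift.
            policy_b[e_b] = policy_a[e_b - shift]
    return policy_b


# Hypothetical policy with e_max = 4. Assuming at most one energy arrival per slot,
# it has two recurrent classes, {0, 1} and {3, 4}.
policy_a = {0: 0, 1: 1, 2: 2, 3: 0, 4: 1}

# Suppose the class {0, 1} yields the greatest reward and shift it upward.
policy_b = shift_policy(policy_a, {0, 1}, e_max=4)
```

In `policy_b` the device stays idle at levels 0 to 2 and reuses the actions of levels 0 and 1 at levels 3 and 4, so every initial level eventually enters the single class at the top of the battery.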


Appendix D 
Proof of Theorem 2

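The original optimization problem can be rewritten using (21) in the following form:

$$\max_{\mu} C_\mu = \max_{\mu^{\rm tot}}\ \max_{\mu \in \mathcal{M}(\mu^{\rm tot})} C_\mu, \tag{56}$$

$$\mathcal{M}(\mu^{\rm tot}) \triangleq \{\mu : \mu^{\rm tot} \text{ and } \mu \text{ are consistent}\}, \tag{57}$$

i.e., we fix the transmission powers (outer max) and focus on all the policies which are consistent with such choice (inner max). This is equivalent to searching through all the possible feasible policies, as in the original problem.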

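Consider the expression of $C_\mu$ in Equation (22) and note that $\pi_{\mu^{\rm tot}}(e)$ does not depend upon the particular power splitting scheme, but only upon $\mu^{\rm tot}$. Thus, the inner max can be moved inside the integral

$$\max_{\mu^{\rm tot}} \Bigg\{ \sum_{e=0}^{e_{\max}} \pi_{\mu^{\rm tot}}(e) \int \max_{\boldsymbol{\rho}_{e,g,h}} c(\boldsymbol{\rho}_{e,g,h}, g, h)\, \mathrm{d}F(g,h) \Bigg\}, \tag{58}$$

where the integral is taken over the joint distribution $F(g,h)$ of the channel gains.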

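Note that inside the integral $e$, $g$ and $h$ are fixed. Therefore, the only degree of freedom in the inner max operation is given by the power splitting choice $\boldsymbol{\rho}_{e,g,h}$.

Since $\mu^{\rm tot}$ and $\mu$ are consistent, in the inner max $\boldsymbol{\rho}_{e,g,h}$ is restricted to the power splitting vectors that are feasible for the transmit power $p^{\rm tot}_{e,g,h}$. Therefore,

$$\max_{\boldsymbol{\rho}_{e,g,h}} c(\boldsymbol{\rho}_{e,g,h}, g, h) = \text{Problem (23) with } x = p^{\rm tot}_{e,g,h}. \tag{59}$$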


Thus, Points 1) and 2) of the theorem solve the internal and 
external max operations, respectively. 
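The inner maximization can be illustrated with a brute-force search over integer power splits. The sketch below is a toy version under stated assumptions: the per-carrier secrecy rate is taken to be the standard form $[\log_2((1+g_k\rho_k)/(1+h_k\rho_k))]^+$, the split must use the whole total power `x`, and the gains are hypothetical. It does not reproduce the solution method of the paper.

```python
import math
from itertools import product


def secrecy_rate(rho, g, h):
    """Sum over sub-carriers of the positive part of the rate difference (assumed form)."""
    return sum(max(0.0, math.log2((1 + gk * r) / (1 + hk * r)))
               for r, gk, hk in zip(rho, g, h))


def best_split(x, g, h):
    """Exhaustive search over integer splits of the total power x across the carriers."""
    best_rho, best_rate = None, -1.0
    for rho in product(range(x + 1), repeat=len(g)):
        if sum(rho) != x:
            continue  # keep only splits that use exactly the total power
        rate = secrecy_rate(rho, g, h)
        if rate > best_rate:
            best_rho, best_rate = rho, rate
    return best_rho, best_rate


# Hypothetical gains: carrier 0 favors the legitimate link, carrier 1 the eavesdropper.
g = (2.0, 0.5)
h = (0.5, 1.0)

optimal_rho, optimal_rate = best_split(4, g, h)
uniform_rate = secrecy_rate((2, 2), g, h)
```

In this example all the power goes to the carrier on which the legitimate channel is stronger, and the resulting rate exceeds the one of the uniform split. The outer maximization over the total power would then be carried out on top of this inner step.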


Appendix E 

Proof of Proposition 3

The proof exploits the results of Appendix A and, in
particular, Equation (54). Also, we focus on the energy levels
in the unique recurrent class (for the transient states the 
proposition is trivial to prove since is always zero). 


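Assume that $p^{{\rm tot}\prime}$ is the optimal transmission power given the state of the system $(e, g', h')$, i.e., $p^{{\rm tot}\prime} = \arg\max_{p^{\rm tot}\in\{0,\ldots,e\}} \{ c(\boldsymbol{\tau}^\star_{p^{\rm tot},g',h'}, g', h') - \lambda(p^{\rm tot}; e) \}$ (we remark that $\boldsymbol{\tau}^\star_{p^{\rm tot},g',h'}$ is the optimal power splitting vector given $p^{\rm tot}$ and the channel gains). Similarly, $p^{{\rm tot}\prime\prime}$ is the optimal power for state $(e, g'', h'')$.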

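We first show by contradiction that if $D(p^{\rm tot}; g', h'; g'', h'') \geq 0$, $\forall p^{\rm tot}$, then $p^{{\rm tot}\prime\prime} \geq p^{{\rm tot}\prime}$.

Assume $p^{{\rm tot}\prime\prime} < p^{{\rm tot}\prime}$. We now derive some properties of $p^{{\rm tot}\prime}$ and $p^{{\rm tot}\prime\prime}$ and combine these with the hypothesis to obtain the contradiction. From the definitions of $p^{{\rm tot}\prime}$ and $p^{{\rm tot}\prime\prime}$, we have

$$c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g',h'}, g', h') - \lambda(p^{{\rm tot}\prime}; e) \geq c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g',h'}, g', h') - \lambda(p^{{\rm tot}\prime\prime}; e), \tag{60}$$

$$c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g'',h''}, g'', h'') - \lambda(p^{{\rm tot}\prime\prime}; e) \geq c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g'',h''}, g'', h'') - \lambda(p^{{\rm tot}\prime}; e). \tag{61}$$

By hypothesis, we have, for every $p^{\rm tot}$,

$$D(p^{\rm tot}; g', h'; g'', h'') \geq 0. \tag{62}$$

Assume that the inequality is strict. This implies, for every $p_A < p_B$,

$$c(\boldsymbol{\tau}^\star_{p_A,g'',h''}, g'', h'') - c(\boldsymbol{\tau}^\star_{p_A,g',h'}, g', h') < c(\boldsymbol{\tau}^\star_{p_B,g'',h''}, g'', h'') - c(\boldsymbol{\tau}^\star_{p_B,g',h'}, g', h'). \tag{63}$$

In particular, since $p^{{\rm tot}\prime\prime} < p^{{\rm tot}\prime}$, choose $p_A = p^{{\rm tot}\prime\prime}$ and $p_B = p^{{\rm tot}\prime}$ and obtain

$$c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g'',h''}, g'', h'') - c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g',h'}, g', h') < c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g'',h''}, g'', h'') - c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g',h'}, g', h'). \tag{64}$$

Finally, by combining (61) with (64), we obtain

$$c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g'',h''}, g'', h'') - \lambda(p^{{\rm tot}\prime}; e) \leq c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g'',h''}, g'', h'') - \lambda(p^{{\rm tot}\prime\prime}; e) \tag{65}$$

$$< c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g'',h''}, g'', h'') - c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g',h'}, g', h') + c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g',h'}, g', h') - \lambda(p^{{\rm tot}\prime\prime}; e), \tag{66}$$

which is equivalent to

$$c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g',h'}, g', h') - \lambda(p^{{\rm tot}\prime}; e) < c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g',h'}, g', h') - \lambda(p^{{\rm tot}\prime\prime}; e),$$

and violates Equation (60), leading to a contradiction.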

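Assume now that (62) holds with equality. Following the previous reasoning, we obtain

$$c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g'',h''}, g'', h'') - c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g',h'}, g', h') = c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g'',h''}, g'', h'') - c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g',h'}, g', h') \tag{67}$$

and, instead of (66),

$$c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime},g',h'}, g', h') - \lambda(p^{{\rm tot}\prime}; e) \leq c(\boldsymbol{\tau}^\star_{p^{{\rm tot}\prime\prime},g',h'}, g', h') - \lambda(p^{{\rm tot}\prime\prime}; e). \tag{68}$$

(68) must be satisfied with equality, otherwise it would violate (60). This means that, for the same state $(e, g', h')$, there exist two distinct values of $p^{\rm tot}$ (i.e., $p^{{\rm tot}\prime}$ and $p^{{\rm tot}\prime\prime}$) that maximize (54). This is not possible because in the recurrent states the optimal solution is unique [42, Vol. II, Sec. 4].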

The first point of Proposition 3 is thus proved. The proof
of the second point is symmetric. 


Appendix F 
Proof of Theorem 3

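We want to prove that, for OSP and $N = 1$, the transmit power $p^{\rm tot}$ does not decrease with $g$ and does not increase with $h$. When the eavesdropper's gain $h$ is the same in both states, $D(p^{\rm tot}; g', h; g'', h)$ can be written as

$$D(p^{\rm tot}; g', h; g'', h) = \frac{\mathrm{d}}{\mathrm{d}p^{\rm tot}} \left( \left[\log_2 \frac{1 + g'' p^{\rm tot}}{1 + h\, p^{\rm tot}}\right]^+ - \left[\log_2 \frac{1 + g' p^{\rm tot}}{1 + h\, p^{\rm tot}}\right]^+ \right). \tag{69}$$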


Assume $g'' > g'$. If $g'' < h$, then both terms are zero because $g' < g'' < h$. If $g' < h < g''$, then only the right term is zero. In this case, $D(p^{\rm tot}; g', h; g'', h) \propto g'' - h > 0$. If $h < g' < g''$, then $D(p^{\rm tot}; g', h; g'', h) \propto g'' - g' > 0$.

The proof of the second part is similar. 
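The three cases can be verified numerically. The following Python sketch approximates the derivative by a central finite difference and compares it with the closed forms $(g''-h)/(\ln 2\,(1+g''p)(1+hp))$ and $(g''-g')/(\ln 2\,(1+g''p)(1+g'p))$, which follow from differentiating the logarithms. The base-2 logarithm, the step size and the numerical gains are assumptions made for illustration only.

```python
import math


def rate(p, g, h):
    """Single-carrier secrecy rate with the positive part (assumed base-2 logarithm)."""
    return max(0.0, math.log2((1 + g * p) / (1 + h * p)))


def d_rate_gap(p, g1, g2, h, dp=1e-6):
    """Central finite difference of rate(p, g2, h) - rate(p, g1, h) with respect to p."""
    def gap(q):
        return rate(q, g2, h) - rate(q, g1, h)
    return (gap(p + dp) - gap(p - dp)) / (2 * dp)


p = 1.0

# Case g' < g'' < h: both terms are zero, so the derivative vanishes.
case_both_zero = d_rate_gap(p, 0.2, 0.5, 1.0)

# Case g' < h < g'': only the g'' term survives, proportional to g'' - h.
case_one_term = d_rate_gap(p, 0.5, 2.0, 1.0)
expected_one_term = (2.0 - 1.0) / (math.log(2) * (1 + 2.0 * p) * (1 + 1.0 * p))

# Case h < g' < g'': the h dependence cancels, proportional to g'' - g'.
case_two_terms = d_rate_gap(p, 2.0, 3.0, 1.0)
expected_two_terms = (3.0 - 2.0) / (math.log(2) * (1 + 3.0 * p) * (1 + 2.0 * p))
```

In all three cases the computed derivative is non-negative, in agreement with the sign analysis above.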